jrollans.com is a Fediverse instance that uses the ActivityPub protocol. In other words, users at this host can communicate with people that use software like Mastodon, Pleroma, Friendica, etc. all around the world.

This server runs the snac software and there is no automatic sign-up process.


Search results for tag #snac2

AodeRelay boosted

[?]Moving Head Ministries » 🌐
@brazeL@norden.social

Recently the link from […] passed by me. An instance written in C that does exactly what it should: keep your data on your server, small and frugal. For […], […], […]. No signup, a limited, functional design (hello 90s), no […], but all the options and complete documentation in man pages, a subset of […], almost arbitrary text length, and a customizable style.css. To compile yourself.

codeberg.org/grunfink/snac2

    [?]Ltning » 🌐
    @ltning@weirdr.net

    So thanks to @stefano@bsd.cafe and his post[1] about caching JSON data for using HAProxy, I've gone ahead and implemented the same cache in Nginx on this old machine. Assuming it works, I'll do the same for my poor 486 at home..and probably post a cleaned-up config somewhere. :)

    But first, let's see how this post fares. Here we go ...

    [1] https://it-notes.dragas.net/2026/05/18/fedimeteo-haproxy-and-the-art-of-not-wasting-snac-threads/

      AodeRelay boosted

      [?]AndiS 🌞🍷🇪🇺 » 🌐
      @andi@snac.sonnenmulde.at

      Selfhosting snac is great and easy. Enjoy!

      You can't remain the Great Emperor of Mastodon though! How about "King of my Snacle"? 😉

      CC: @stefano@bsd.cafe

        AodeRelay boosted

        [?]IT Notes - https://it-notes.dragas.net » 🤖 🌐
        @itnotes@snac.it-notes.dragas.net

        Aggressive caching for a Mastodon reverse proxy: what to cache, what to never cache, and why content negotiation will eventually betray you

        I have written before about putting a cache in front of snac (https://it-notes.dragas.net/2025/01/29/improving-snac-performance-with-nginx-proxy-cache/), and more recently about the HAProxy layer in front of FediMeteo (https://it-notes.dragas.net/2026/05/18/fedimeteo-haproxy-and-the-art-of-not-wasting-snac-threads/). The general idea is always the same: the reverse proxy should absorb the repetitive, public work that has no business reaching the application server.

        This post is the same idea applied to a much louder neighbour: a Mastodon instance. The instance is mastodon.bsd.cafe (https://mastodon.bsd.cafe/), the proxy is nginx on FreeBSD, and the configuration below is what I am currently running in production.

        Mastodon is heavier than snac in every direction. It has Puma and Sidekiq behind it, more endpoints, more streaming, more federation patterns, and one specific characteristic that complicates everything: it serves multiple representations on the same URLs. The same path returns HTML to a browser, ActivityPub JSON to a remote instance, and sometimes plain JSON to an API client. If the proxy treats the URL as one thing, sooner or later it will return the wrong thing to the wrong client.

        Most of the work below comes from that single observation.

        If I had to summarize this whole post in a single sentence, it would be this:

        Mastodon is not a website. It is a website, an API, and an ActivityPub server, all sharing the same URLs.

        Everything else in this configuration - cache keys, variants, bypass rules, the diagnostic headers - is decoration around that one fact.

        A popular toot from a friend gets boosted. Twenty federated instances ask for the same ActivityPub object within the same second. Browsers fetch the HTML version of the same URL. If the proxy sees only "a URL", it will eventually betray you: a remote instance will receive HTML, a browser will receive ActivityPub JSON, and you will spend an afternoon wondering why your timeline looks broken on three different servers. I have spent that afternoon. I do not recommend it.

        Assumptions before anything else

        Before any directive, this configuration assumes a few things about the instance. If any of these does not match your setup, the directives still make sense, but you must read the caveats at the end before adapting them.

        The first assumption is that AUTHORIZED_FETCH (secure mode) is disabled. With secure mode off, all ActivityPub GET responses cached at the proxy layer are public and identical regardless of the requesting actor. With secure mode on, Mastodon can legitimately return different bodies depending on which remote actor is asking, and caching them blindly at the proxy becomes at best wasteful, at worst a cache-poisoning surface.

        This is not a hypothetical. CVE-2026-25540 (https://nvd.nist.gov/vuln/detail/CVE-2026-25540), fixed in Mastodon 4.3.19, 4.4.13, and 4.5.6, is exactly this kind of mistake, but inside Mastodon's own Rails.cache: the pinned posts and featured hashtags endpoints had actor-dependent ActivityPub responses but were keyed without the actor. The CVE does not directly apply to nginx caches, but the underlying lesson does. Do not cache what depends on the caller unless the caller is part of the cache key. Keep this rule in mind every time you are tempted to cache a federation endpoint "just in case".

        The second assumption is that no signed-URL storage backend sits behind /system/ or /media_proxy/. If those paths ever redirect to short-lived presigned S3 or SeaweedFS URLs, my TTLs below are too long: nginx will happily cache a redirect to a URL that has already expired.

        The third assumption is that federation traffic uses HTTP Signatures, not the HTTP Authorization header. Mastodon signs federated GETs with the Signature header. The Authorization-based skip-cache rule further down catches API tokens, not signed federation traffic. If you enable AUTHORIZED_FETCH, you must add an explicit skip rule for $http_signature.

        I am being deliberate about these assumptions because the configuration that follows is internally consistent only as long as they hold.

        The proxy in front of mastodon.bsd.cafe has three jobs:

        TLS termination, microcaching of expensive endpoints (especially federation-heavy collections and default public routes), and long-lived caching of immutable assets and user media.

        The point is not to replace Mastodon's internal Rails cache. The point is to absorb spiky federation traffic and repetitive asset fetches that would otherwise hit Puma and Rails for every single request.

        The strategy is deliberately layered: very long TTL on fingerprinted assets, medium TTL on user-uploaded media, very short microcache on dynamic pages and federation endpoints that get hammered, and explicit bypass rules for anything private, authenticated, actor-dependent, or otherwise unsafe.

        Every cacheable layer is keyed correctly for content negotiation. That is the part that matters most.

        The cache zone

        A single cache zone is shared across all Mastodon locations:

        proxy_cache_path /var/cache/nginx/mastodon
            levels=1:2
            keys_zone=mastodon_cache:200m
            max_size=20g
            inactive=24h
            use_temp_path=off;

        200m of keys zone holds metadata for roughly 1.6 million entries in RAM (nginx documents about 8 thousand keys per megabyte). Cached bodies can grow up to 20g on disk. The two numbers are independent: keys live in shared memory, bodies live on the filesystem, and the cache key is what links them.

        inactive=24h evicts anything not requested for a day, even if there is free space. This is intentional. I do not want a long, cold tail of stale entries to squat in the cache forever. I want the working set to remain hot, and I want the rest to fade.

        use_temp_path=off is small but important. By default nginx writes a cached response to a temporary file and then renames it into place. If the temp path and cache path are on different filesystems, that cheap rename becomes a real copy. Setting use_temp_path=off puts temporary files directly under the cache directory and avoids that trap. It is the kind of detail nobody mentions until something is suspiciously slow.

        Of all the maps in this configuration, only one really earns its place. This one:

        map $http_accept $mastodon_cache_variant {
            default                           "default";
            "~*application/activity\+json"    "activitypub";
            "~*application/ld\+json"          "activitypub";
            "~*application/json"              "json";
            "~*text/html"                     "html";
        }

        Mastodon serves the same URL with different bodies depending on the Accept header. A status URL like /@user/123456789 returns rendered HTML to a browser and an ActivityPub object to another federated instance. If you cache by URL alone, the first request that comes in wins and the next request receives the wrong content type. Instances start federating HTML, browsers start downloading JSON, and the failure is subtle enough to waste hours.

        The map normalizes Accept into four buckets - activitypub, json, html, and default - and the result is folded into the cache key in every location that does content negotiation:

        proxy_cache_key "$scheme$host$request_uri|accept=$mastodon_cache_variant";

        Coalescing equivalent MIME types is intentional. application/activity+json and application/ld+json both map to activitypub, because splitting them across two cache buckets would fragment the cache for no useful operational gain.

        A subtle point I want to be explicit about: I do not include $request_method in the cache key. nginx already converts HEAD into GET for caching purposes by default, which is what I want here. A HEAD request on /@user/123 should hit the same cache entry as a GET request on the same URL. Adding the method would only separate them for no benefit.
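
        As a small sketch of that point: the HEAD-to-GET conversion is controlled by proxy_cache_convert_head, which is on by default. The commented-out lines are only an illustration of when the method would belong in the key, not something taken from the production configuration.

        ```nginx
        # Default behaviour, written out explicitly: HEAD is converted to GET
        # before the cache lookup, so both methods share one entry.
        proxy_cache_convert_head on;
        proxy_cache_key "$scheme$host$request_uri|accept=$mastodon_cache_variant";

        # Only if the conversion were disabled would the method need to be
        # part of the key (illustrative, not used here):
        # proxy_cache_convert_head off;
        # proxy_cache_key "$request_method|$scheme$host$request_uri|accept=$mastodon_cache_variant";
        ```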

        During rollout I also expose the selected variant as a response header:

        add_header X-Cache-Variant $mastodon_cache_variant always;

        The header is there to verify the behaviour in production. It can come off once the configuration has proved itself, but I tend to leave it on. A cache that works should be visible. A cache that is invisible can be correct, but it can also be silently wrong, and I would rather know.

        This is the first real gotcha, and I want to spend a moment on it because it caught me out the first time I configured a similar setup.

        nginx honors the upstream Vary response header in addition to proxy_cache_key. If Mastodon emits Vary: Accept, or worse, Vary: Accept, Cookie, ..., my carefully normalized variant key gets paired with nginx's native Vary handling. The result is that the cache may still fragment on the full, un-normalized Accept header - which defeats the entire point of the variant map.

        There is another, very specific failure mode on older or unpatched nginx builds. nginx stores the Vary value in a fixed-size cache metadata field. Historically that field was 42 bytes, which is famously short and almost charmingly suspicious of being a Douglas Adams reference. Modern nginx raised the limit to 128 bytes, which is enough for the common cases but still surprisingly small. If your upstream emits a long Vary header, anything beyond the limit is treated as Vary: *, which means the response is not cached at all. The only signal you get is a critical line in the error log, and unless you are looking for it, you will not see it.

        The operational lesson is the same in both cases: if you rely on your own normalized variant key, do not assume upstream Vary is harmless. Check your nginx version, check your error log, and verify cache behaviour via X-Cache-Status and X-Cache-Variant.
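
        One way to see what the backend is actually sending is to log the upstream Vary value next to the cache status. This is a minimal sketch: the format name vary_debug and the log path are assumptions for illustration, and $upstream_http_vary is the generic $upstream_http_<name> variable, which I would expect to be populated on responses that reach the backend.

        ```nginx
        # http level. Hypothetical debug format; name and path are placeholders.
        log_format vary_debug '$request_uri '
                              'cache=$upstream_cache_status '
                              'variant=$mastodon_cache_variant '
                              'upstream_vary="$upstream_http_vary"';

        # server level. A second access_log can sit next to the main one.
        access_log /var/log/nginx/vary_debug.log vary_debug;
        ```

        A long or unexpected upstream_vary value on a path that never produces a HIT is a good hint about where to look first.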

        On the locations where the variant map is the cache dimension I care about, I take responsibility explicitly:

        proxy_ignore_headers Vary;

        This tells nginx to stop using upstream Vary to protect me. That is fine only if my own cache key and request normalization cover every response dimension that matters. In particular, I make sure the backend is not also varying on Accept-Encoding in a way that would create compressed and uncompressed variants behind my back. The cleanest way to avoid that is not to forward Accept-Encoding to the backend at all, and let frontend nginx handle compression itself:

        proxy_set_header Accept-Encoding "";

        This is the kind of decision I prefer to be explicit about. Ignoring Vary is not magic. It is a responsibility, and it should be paired with the rules that take its place.
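
        With Accept-Encoding stripped on the way to the backend, client-facing compression has to be enabled on the frontend. A minimal sketch of what that could look like follows; the MIME type list and the minimum length are assumptions, not values from the production configuration.

        ```nginx
        # Frontend compression: the cache stores one uncompressed copy,
        # and nginx compresses per client on the way out.
        gzip on;
        gzip_vary on;          # adds "Vary: Accept-Encoding" toward clients
        gzip_min_length 256;   # assumed threshold; tiny bodies are not worth compressing
        # text/html is always compressed; the rest must be listed explicitly.
        gzip_types application/json
                   application/activity+json
                   application/ld+json
                   text/css
                   application/javascript;
        ```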

        Rather than build one giant boolean to decide what bypasses cache, I prefer to decompose the logic into small orthogonal maps. Each map is 1 when caching must be skipped, and the final decision is an OR of all of them.

        map $request_method $skip_cache_method {
            default 1;
            GET     0;
            HEAD    0;
        }

        map $http_authorization $skip_cache_auth {
            default 1;
            ""      0;
        }

        map $http_cookie $skip_cache_cookie {
            default 1;
            ""      0;
        }

        map $uri $skip_cache_uri {
            default                    0;
            ~^/auth                    1;
            ~^/oauth                   1;
            ~^/settings                1;
            ~^/admin                   1;
            ~^/api/v1/custom_emojis$   0;
            ~^/api/v1/instance$        0;
            ~^/api/v2/instance$        0;
            ~^/api/v1/trends/tags$     0;
            ~^/api/oembed$             0;
            ~^/api/                    1;
        }

        The reasoning is straightforward. Only GET and HEAD are cacheable; everything else, including POST, DELETE, PUT, and ActivityPub deliveries, must pass through. Any request carrying an Authorization header is an API call with a token, and those are never public. Any request with a cookie is potentially logged-in traffic, and caching logged-in pages would leak personal timelines across users. Auth flows, settings, admin, and most of the API bypass the cache by URI, while a small, carefully chosen set of slow-changing public API endpoints is allowed through.

        The important caveat I want to underline: the Authorization map does not catch signed federated GETs. Mastodon federation uses HTTP Signatures, which means the relevant request header is Signature. If AUTHORIZED_FETCH is enabled, you must add a parallel map:

        map $http_signature $skip_cache_signature {
            default 1;
            ""      0;
        }

        and then include it in both proxy_cache_bypass and proxy_no_cache. Do this before enabling secure mode, not after.
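
        As a sketch of that wiring, assuming the $skip_cache_signature map is defined at the http level next to the other skip maps, each cacheable location would carry one extra variable in both directives:

        ```nginx
        # Same bypass pair as in the rest of the configuration,
        # with the signature map appended for AUTHORIZED_FETCH.
        proxy_cache_bypass $skip_cache_method $skip_cache_auth $skip_cache_cookie $skip_cache_uri $skip_cache_signature;
        proxy_no_cache     $skip_cache_method $skip_cache_auth $skip_cache_cookie $skip_cache_uri $skip_cache_signature;
        ```

        Both directives skip the cache as soon as any one of the listed values is non-empty and not "0", which is what makes the OR-of-maps approach work.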

        The maps are used together in each cacheable location:

        proxy_cache_bypass $skip_cache_method $skip_cache_auth $skip_cache_cookie $skip_cache_uri;
        proxy_no_cache $skip_cache_method $skip_cache_auth $skip_cache_cookie $skip_cache_uri;

        Both directives are necessary. proxy_cache_bypass means "do not read from cache for this request". proxy_no_cache means "do not write this response to cache". Without proxy_no_cache, a logged-in user's response could still poison the anonymous cache. Without proxy_cache_bypass, a request that should have gone straight to the backend might still receive a cached anonymous response. I keep both, every time.

        Most locations share a common proxy baseline. There is nothing clever here, but if any of these lines is missing the rest of the configuration quietly does less than expected.

        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header Connection "";
        proxy_set_header Accept-Encoding "";

        proxy_http_version 1.1 and proxy_set_header Connection "" matter for upstream keepalive. Without them, nginx may use HTTP/1.0 semantics upstream and send Connection: close on every request, which makes the keepalive directive on the upstream block far less useful than it looks.

        proxy_set_header Accept-Encoding "" keeps backend responses uncompressed so nginx can cache a single representation and handle client-facing compression itself. It also prevents accidental cache fragmentation through Vary: Accept-Encoding, which would otherwise creep in despite the variant map.

        These settings are not exciting, and they should not be. The interesting parts of an infrastructure are not always the parts that should be unusual.

        The Mastodon server block in my configuration ends up with seven distinct request profiles. Six of them cache; one explicitly does not, because streaming is not a cacheable workload.

        I do not group them under one location / with a giant if block. I prefer to keep each profile in its own location, even if some of them look similar. When something goes wrong in production, I want to be able to point at one location and reason about it without holding the rest of the configuration in my head.

        location ~ ^/(assets|packs|emoji)/ {
            proxy_cache mastodon_cache;
            proxy_cache_key "$scheme$host$request_uri";
            proxy_ignore_headers Vary;

            proxy_cache_valid 200 301 302 7d;
            proxy_cache_valid 404 10m;

            proxy_cache_lock on;
            proxy_cache_use_stale error timeout updating http_500 http_502 http_503 http_504;
            proxy_cache_background_update on;

            proxy_cache_bypass $skip_cache_method $skip_cache_auth $skip_cache_cookie $skip_cache_uri;
            proxy_no_cache $skip_cache_method $skip_cache_auth $skip_cache_cookie $skip_cache_uri;

            proxy_pass http://$custom_upstream;
        }

        These paths are content-addressed. Webpack fingerprints filenames with hashes, so a new deploy publishes new URLs while the old URLs remain valid. A 7-day TTL is safe because /packs/js/common-abc123.js will never become different content under the same URL. If it does, it has a new URL.

        404s get a short 10-minute TTL so a temporarily missing asset can recover quickly.

        proxy_cache_lock on is the thundering-herd guard. When a popular asset is not cached and ten clients ask for it at once, nine wait for the first request to populate the cache instead of all ten hammering the backend. I like this directive a lot. It is the kind of small switch that quietly removes a class of problems.

        proxy_cache_use_stale together with proxy_cache_background_update is the stale-while-revalidate pattern. If an entry has expired but Mastodon is slow or briefly down, nginx can serve the stale copy and refresh it asynchronously. For static assets this is almost always the right trade-off. The asset has not actually changed under the same URL, and a few extra hours of stale data hurt nobody.

        location ~ ^/system/(accounts/avatars|media_attachments/files|custom_emojis/images)/ {
            proxy_cache mastodon_cache;
            proxy_cache_key "$scheme$host$request_uri";
            proxy_ignore_headers Vary;

            proxy_cache_valid 200 302 6h;
            proxy_cache_valid 404 5m;

            proxy_cache_lock on;
            proxy_cache_use_stale error timeout updating http_500 http_502 http_503 http_504;
            proxy_cache_background_update on;

            proxy_cache_bypass $skip_cache_method $skip_cache_auth $skip_cache_cookie $skip_cache_uri;
            proxy_no_cache $skip_cache_method $skip_cache_auth $skip_cache_cookie $skip_cache_uri;

            proxy_pass http://$custom_upstream;
        }

        Avatars, attachment thumbnails, and custom emoji are also effectively content-addressed, because the file path contains an ID. They can still be replaced or deleted, so the TTL is more conservative than for assets: six hours instead of seven days.

        The 302 status is also cached, because Mastodon may redirect to another storage location, and the redirect is usually stable enough to cache for hours.

        This is also where the caveat about signed URLs really matters. If you ever put a signed-URL backend behind /system/, this TTL must be shorter than the signed URL lifetime, or nginx will eventually serve a redirect to a URL that no longer works. On mastodon.bsd.cafe I do not use signed URLs, so six hours is fine.
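
        For a setup that does use signed URLs, a sketch of the adjustment could look like the lines below. The 15-minute signed URL lifetime is purely an assumption for illustration; the only real rule is that the redirect TTL stays comfortably below whatever lifetime the storage backend issues.

        ```nginx
        # Assumption: presigned URLs are valid for 15 minutes.
        # Bodies served directly keep the long TTL; redirects get a short one.
        proxy_cache_valid 200 6h;
        proxy_cache_valid 302 5m;
        proxy_cache_valid 404 5m;
        ```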

        location ~ ^/(users|ap/users)/[^/]+/statuses/[0-9]+/replies {
            proxy_cache mastodon_cache;
            proxy_cache_key "$scheme$host$request_uri|accept=$mastodon_cache_variant";
            proxy_ignore_headers Vary;

            proxy_cache_valid 200 30s;
            proxy_cache_valid 404 10s;

            proxy_cache_lock on;
            proxy_cache_lock_timeout 1s;
            proxy_cache_lock_age 5s;

            proxy_cache_bypass $skip_cache_method $skip_cache_auth $skip_cache_cookie $skip_cache_uri;
            proxy_no_cache $skip_cache_method $skip_cache_auth $skip_cache_cookie $skip_cache_uri;

            proxy_pass http://$custom_upstream;
        }

        This is the location I tuned most carefully. When a status starts going viral, dozens of federated instances poll /replies to build their thread view, often within the same second. The same URL must serve an HTML thread view to browsers and an ActivityPub OrderedCollection to remote instances, so the variant key is essential here.

        A 30-second microcache absorbs the spike without serving meaningfully stale data. A reply that appears 30 seconds late in a federated thread is usually invisible to humans, while the backend relief is very visible.

        The lock settings keep backend load and latency bounded. proxy_cache_lock_timeout 1s bounds how long queued requests wait behind the lock. If the timeout expires, they go to the upstream directly, but their responses are not stored in the cache, which prevents a runaway thundering herd from clogging the cache fill path. proxy_cache_lock_age 5s prevents one slow cache-populating request from monopolizing the fill path forever; if the request holding the lock has not completed after 5 seconds, nginx may let another request reach the upstream to retry.

        I have currently left proxy_cache_use_stale off on this location while I am still validating the deployment. This is a deliberate debugging stance, not a permanent choice. Stale-while-revalidate is useful in production, but during rollout it can hide upstream issues while I am trying to understand the system. Once the behaviour is stable, the production version will be:

        proxy_cache_use_stale error timeout updating http_500 http_502 http_503 http_504;
        proxy_cache_background_update on;

        location ^~ /media_proxy/ {
            proxy_cache mastodon_cache;
            proxy_cache_key "$scheme$host$request_uri";

            proxy_cache_valid 200 10m;
            proxy_cache_valid 301 302 10m;
            proxy_cache_valid 404 1m;

            proxy_ignore_headers Cache-Control Expires Vary;

            proxy_cache_lock on;
            proxy_cache_use_stale error timeout updating http_500 http_502 http_503 http_504;
            proxy_cache_background_update on;

            proxy_cache_bypass $skip_cache_method $skip_cache_auth $skip_cache_cookie $skip_cache_uri;
            proxy_no_cache $skip_cache_method $skip_cache_auth $skip_cache_cookie $skip_cache_uri;

            proxy_pass http://$custom_upstream;
        }

        Mastodon's /media_proxy/ fetches remote media so clients do not leak their IP address to remote servers. The response is the same regardless of Accept, so the cache key intentionally omits the variant. Splitting media proxy responses across html, json, activitypub, and default buckets would only waste storage.

        proxy_ignore_headers Cache-Control Expires Vary is deliberate here. Mastodon may emit conservative cache headers, or none at all, and I want the proxy to enforce a short local 10-minute policy regardless of what the backend says.

        Set-Cookie is not in the ignore list. nginx's default refusal to cache responses carrying Set-Cookie still applies, and I want it to. It is a safety net I do not want to disable just to win a few cache hits.

        The ^~ prefix is a small useful detail. Once this location matches, nginx stops evaluating regex locations. Media proxy traffic can be heavy, and skipping further regex matching is a tiny but free win.

        location ~ ^/(users|ap/users)/[^/]+/(followers|following) {
            proxy_pass http://$custom_upstream;
        }

        This one is a pure proxy, no cache. I want to be explicit that this is a decision, not an omission.

        /users/<name>/followers and /users/<name>/following are pagination-heavy, change frequently as people follow and unfollow, and are queried by federation crawlers in ways that would make the cache key proliferate through pages and cursors. The likely hit ratio is poor, the risk of serving stale social graph data is non-trivial, and the cost of caching them - in storage and in mental overhead - is not worth it.

        If a remote instance starts hammering these endpoints, the right answer is rate limiting with limit_req_zone, not retrofitting cache as a rate limiter.
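
        A minimal sketch of that rate limiting, assuming one bucket per client address is good enough because a remote instance usually arrives from one or a few addresses. The zone name, zone size, rate, and burst are illustrative assumptions and would need tuning against real traffic.

        ```nginx
        # http level: one bucket per client address.
        # Zone name, size, and rate are placeholders, not production values.
        limit_req_zone $binary_remote_addr zone=social_graph:10m rate=5r/s;

        # server level: same location as above, still uncached.
        location ~ ^/(users|ap/users)/[^/]+/(followers|following) {
            limit_req zone=social_graph burst=20 nodelay;
            limit_req_status 429;   # tell well-behaved crawlers to back off
            proxy_pass http://$custom_upstream;
        }
        ```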

        Default location: the microcache and streaming without cache

        location / {
            proxy_cache mastodon_cache;
            proxy_cache_key "$scheme$host$request_uri|accept=$mastodon_cache_variant";
            proxy_ignore_headers Vary;

            proxy_cache_valid 200 10s;
            proxy_cache_valid 301 302 1m;
            proxy_cache_valid 404 10s;

            proxy_cache_lock on;
            proxy_cache_lock_timeout 5s;

            proxy_cache_bypass $skip_cache_method $skip_cache_auth $skip_cache_cookie $skip_cache_uri;
            proxy_no_cache $skip_cache_method $skip_cache_auth $skip_cache_cookie $skip_cache_uri;

            proxy_next_upstream error timeout http_502 http_503 http_504;
            proxy_next_upstream_tries 2;

            proxy_pass http://$custom_upstream;
        }

        Everything not matched by a more specific location falls here: profiles, individual statuses, the about page, the public timeline, and many ActivityPub object fetches.

        The TTL is only 10 seconds for 200 responses. That is enough to deduplicate the wave of requests when a popular toot gets boosted or linked from elsewhere, without making the page feel stale to a human visitor.

        It is worth being honest that short TTLs still cost CPU. A 10-second microcache on a sustained-traffic URL means the backend regenerates the entry six times per minute. That is vastly better than serving every request from Rails, but it is not free. If your backend cannot comfortably handle that, raise the TTL, or enable stale-while-revalidate on these dynamic paths.

        proxy_next_upstream with proxy_next_upstream_tries 2 is the failover trigger. If the primary returns 502, 503, 504, or times out, nginx retries on the backup. The chain is capped at two attempts so a sick upstream cannot hold the request indefinitely.

        At the http level:

        map $http_upgrade $connection_upgrade {
            default upgrade;
            ""      close;
        }

        In the server block:

        location /api/v1/streaming {
            proxy_buffering off;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection $connection_upgrade;
            proxy_read_timeout 3600s;
            proxy_send_timeout 3600s;
            proxy_pass http://$custom_upstream;
        }

        Streaming is a WebSocket and SSE-style endpoint. Buffering must be off, otherwise the proxy may hold messages while waiting for buffers to fill. The Upgrade header is passed through from the client, and the Connection header is driven by $connection_upgrade, which is upgrade only when the client actually sent an Upgrade header. That way a non-WebSocket request to the same path does not get its Connection header mangled.

        The hour-long read and send timeouts allow long-lived streams to stay open through quiet periods.

        There is no cache here. Streaming is not a cacheable workload, and trying to make it one is one of those ideas that sounds clever for about thirty seconds.

        Upstream and failover

        upstream mastodonbsdcafe {
            server 192.168.123.33 max_fails=3 fail_timeout=30s;
            server 192.168.122.133 backup;
            keepalive 64;
        }

        The primary backend is on another VPS; the backup is in a jail next to the reverse proxy. After three failed attempts within a 30-second window, the primary is marked unavailable for 30 seconds. Traffic flips to the backup, then nginx retries the primary after the window.

        keepalive 64 holds up to 64 idle TCP connections to the upstream per worker. On a busy instance, this saves real handshake overhead, but only if the proxied connection can actually stay open. That is why the shared proxy settings include proxy_http_version 1.1 and proxy_set_header Connection "". Without those, upstream keepalive does much less than it looks like it should.

        I also use an indirection layer:

        map $remote_addr $custom_upstream {
            default mastodonbsdcafe;
        }

        Today everything defaults to the main upstream group. The map exists so that specific client IPs can be pinned to a specific upstream when I am debugging, or so an admin connection can be routed to the backup while the primary is being tested. It costs nothing to have it sitting there, and it has saved me time more than once.
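
        To make the pinning idea concrete, here is a sketch of what a debugging session could look like. The upstream name mastodon_backup_only and the client address 203.0.113.10 (from the documentation range) are hypothetical; only the backup address comes from the upstream block above.

        ```nginx
        # Hypothetical second group containing only the backup server.
        upstream mastodon_backup_only {
            server 192.168.122.133;
            keepalive 16;
        }

        map $remote_addr $custom_upstream {
            default        mastodonbsdcafe;
            # Hypothetical admin workstation, routed to the backup only.
            203.0.113.10   mastodon_backup_only;
        }
        ```

        Because proxy_pass uses the variable, nginx resolves the value against the defined upstream groups per request, so removing the extra line and reloading is enough to undo the pin.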

        What I log and why

        log_format detailed '$remote_addr - $remote_user [$time_local] '
                            '"$request" $status $body_bytes_sent '
                            '"$http_referer" "$http_user_agent" '
                            'rt=$request_time '
                            'uct=$upstream_connect_time '
                            'uht=$upstream_header_time '
                            'urt=$upstream_response_time '
                            'us=$upstream_status '
                            'ua=$upstream_addr '
                            'cache=$upstream_cache_status '
                            'variant=$mastodon_cache_variant';

        access_log /var/log/nginx/access.mastodon.bsd.cafe.log detailed;

        add_header X-Cache-Status $upstream_cache_status always;
        add_header X-Cache-Variant $mastodon_cache_variant always;

        This log format is purpose-built for the cache layer. For each request it records total request time, upstream connect time, upstream header time, upstream response time, upstream status, which backend served the request, cache status, and which content-negotiation variant was selected.

        The cache status is one of the values nginx exposes through $upstream_cache_status: HIT, MISS, BYPASS, EXPIRED, STALE, UPDATING, or REVALIDATED. The response headers expose the same information to the client, which makes it trivial to verify behaviour with curl -I or browser dev tools.

        The always qualifier matters. Without it, nginx only adds these headers to responses with a fixed set of success and redirect codes (200, 201, 204, 206, 301, 302, 303, 304, 307, 308), so a 502 from the backend would arrive without the diagnostic headers you need most. I want them on every response, no exceptions.

        There is also a small operational detail I find pleasant: a custom 502 page.

        error_page 502 /502.html;
        location = /502.html {
            root /usr/local/www/mastodon_errors;
            internal;
        }

        It is not part of the cache strategy, but it makes backend hiccups less ugly. And I block some abusive user agents with 444, which closes the connection without sending any response at all:

        if ($http_user_agent ~* "bytespider") {
            return 444;
        }

        This is not a general bot strategy. It is just a cheap refusal path for traffic I know I do not want.
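
        If the list of unwanted agents ever grows beyond one entry, the same refusal can be expressed with a map, in the style of the skip-cache maps earlier. This is only a sketch: the variable name $blocked_agent is an assumption, and the single pattern is the one already used above.

        ```nginx
        # http level: 1 means "refuse this user agent".
        map $http_user_agent $blocked_agent {
            default          0;
            "~*bytespider"   1;
            # further patterns would go here, one per line
        }

        # server level: close the connection without a response.
        if ($blocked_agent) {
            return 444;
        }
        ```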

        How I check it actually works

        A configuration that I cannot verify is a configuration I do not trust. Here is the short set of commands I keep in a paste buffer for this proxy.

        The first verification is variant separation. Three requests to the same URL with different Accept headers should produce three independent cache entries:

        for v in 'text/html' \
                 'application/activity+json' \
                 'application/ld+json; profile="https://www.w3.org/ns/activitystreams"'; do
            printf '%-75s -> ' "$v"
            # Strip the trailing CR from each header line so the output stays on one row.
            curl -s -o /dev/null -D - -H "Accept: $v" \
                https://mastodon.bsd.cafe/@someuser/123456789 \
                | awk '/^[Xx]-[Cc]ache/ { sub(/\r$/, ""); printf "%s ", $0 } END { print "" }'
        done
        On the first pass, every variant should be a MISS. On the second pass, every variant should be a HIT, with X-Cache-Variant showing the expected bucket.

        The second verification is that cookies and Authorization always trigger BYPASS:

        curl -I -H 'Cookie: _mastodon_session=test' \
            https://mastodon.bsd.cafe/@someuser

        curl -I -H 'Authorization: Bearer fake' \
            https://mastodon.bsd.cafe/api/v1/timelines/home

        Both should return X-Cache-Status: BYPASS. If they do not, the skip-cache rules are wrong, and the entire setup is unsafe.

        If you intend to enable AUTHORIZED_FETCH, the third verification is for signed GETs. A quick synthetic check that the nginx map fires correctly:

        curl -I -H 'Signature: fake' \
            -H 'Accept: application/activity+json' \
            https://mastodon.bsd.cafe/users/someuser
        If you added $skip_cache_signature, the result should be X-Cache-Status: BYPASS.

        Finally, the logs themselves tell me how the cache is performing in production. Cache status distribution:

        awk '{
            for (i = 1; i <= NF; i++)
                if ($i ~ /^cache=/) c[$i]++
        }
        END {
            for (k in c) print k, c[k]
        }' /var/log/nginx/access.mastodon.bsd.cafe.log
        A healthy instance shows cache=HIT and cache=BYPASS doing most of the work, with cache=MISS accounting for cold paths and short-TTL refreshes. The same trick works for the variant distribution:

        awk '{
            for (i = 1; i <= NF; i++)
                if ($i ~ /^variant=/) v[$i]++
        }
        END {
            for (k in v) print k, v[k]
        }' /var/log/nginx/access.mastodon.bsd.cafe.log
        This tells me what my traffic actually looks like. A federation-heavy instance shows a lot of activitypub. An instance with many human visitors shows more html. On mastodon.bsd.cafe the balance shifts depending on what is happening in the wider Fediverse on any given day.

        Caveats worth being honest about

        I do not like presenting configurations as magic, so I want to be explicit about the conditions under which this one is appropriate.

        Short TTLs cost CPU. A 10-second microcache on a sustained-traffic URL means six backend regenerations per minute. That is much better than no cache, but it is not free. If the backend cannot comfortably handle that, raise the TTL or enable stale-while-revalidate on the dynamic paths.
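
The arithmetic behind that caveat can be sketched in a few lines of Python. This is only an illustration: the function name and the `variants` parameter are hypothetical, and it assumes traffic is steady enough that every expiry is followed at once by exactly one regeneration, with cache locking collapsing concurrent misses.

```python
def regenerations_per_minute(ttl_seconds: float, variants: int = 1) -> float:
    """Upper bound on backend fetches per minute for one hot URL.

    Assumes each expiry triggers exactly one upstream request, and that
    every content-negotiation variant is cached (and expires) separately.
    """
    if ttl_seconds <= 0:
        raise ValueError("ttl_seconds must be positive")
    return 60.0 / ttl_seconds * variants


for ttl in (10, 30, 60):
    print(f"TTL {ttl:>2}s -> {regenerations_per_minute(ttl):g} regenerations/min")
```

If a URL is actively requested in three variants, the 10-second case is closer to 18 regenerations per minute than 6, which is one more reason to watch the variant distribution in the logs.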

        Dynamic stale-while-revalidate is powerful but it hides problems. I currently keep proxy_cache_use_stale off on the dynamic locations because I am still validating behaviour. In steady-state production, stale-while-revalidate is usually the right choice. During rollout, it can quietly hide upstream errors and make debugging harder. Be honest with yourself about which mode you are in.

        AUTHORIZED_FETCH changes the threat model. With secure mode disabled, public ActivityPub GET responses are safe to cache as public content, provided your cache key handles content negotiation correctly. With secure mode enabled, ActivityPub responses can become actor-dependent. At that point you must either bypass cache for signed GETs or include the signing actor in the key. The latter usually destroys the hit ratio, so bypassing is the practical answer.

        The variant map is a compromise. It covers application/activity+json, application/ld+json, application/json, and text/html. Everything else falls into the default bucket. That is intentional, but the default bucket is still a bucket. If you discover a real client type that matters on your instance, add it explicitly.

        Ignoring Vary is a responsibility. proxy_ignore_headers Vary is not magic; it tells nginx to stop protecting you based on upstream Vary. That is fine only if your own cache key and request normalization cover every dimension Vary was protecting. For this configuration that means normalizing Accept into a variant, avoiding backend Accept-Encoding variation, never caching cookies or authorization, and never caching signed GETs if secure mode is enabled.

        Followers and following are uncached on purpose. They are pagination-heavy and change frequently. Caching them would create many low-value entries with questionable freshness. If a remote instance hammers these endpoints, use limit_req_zone. Do not retrofit cache as a rate limiter.

        Signed-URL redirects require shorter TTLs. Caching 302s is useful when redirects are stable. It is dangerous when redirects point to short-lived signed URLs. If your media storage returns presigned URLs, your nginx redirect TTL must be shorter than the URL lifetime.

        Set-Cookie must remain special. Do not add Set-Cookie to proxy_ignore_headers unless you are absolutely sure the location cannot produce user-specific responses. nginx's default refusal to cache Set-Cookie responses is a safety net. Keep it.

        A good configuration is a written form of the assumptions behind a service. When the assumptions change, the configuration must change too.

        There is no single brilliant directive in this configuration. The trick is combining long TTLs for immutable assets, medium TTLs for media, tiny TTLs for dynamic public pages, cache locking for thundering-herd protection, strict bypass rules for private or actor-dependent traffic, a normalized content-negotiation key, and enough logging to prove the system is doing what I think it is.

        What this layer buys me, in one sentence: fewer requests reach Puma and Rails.

        That is the metric I care about. Mastodon is not slow, but it is heavy, and the bigger the instance grows the more it benefits from a layer in front that quietly absorbs the work that does not need to be done by the application. A reverse proxy that caches Mastodon safely has to remember, with every request, that the same URL might mean three different things to three different clients. Once it does, even a very short microcache can remove a surprising amount of load without changing the user-visible behaviour of the instance.

        https://it-notes.dragas.net/2026/06/05/aggressive_caching_for_a_mastodon_reverse_proxy/


          #snac2 boosted

          [?]Mason Loring Bliss [he, him, his] » 🌐
          @mason@partychickens.net

          I think it's fair to say that I quite dislike the Ruby environment. That said, the server's now running Mastodon 4.5.10. I still need to stand up my server again. And I'm interested in firing up as it's possible that my reading behaviours are converging with its feature set.

          Hint for folks running Mastodon: Don't try to use system packages any more than you must. You're going to end up needing to use rbenv or everything comes apart at the seams. Also, it's not enough following Mastodon's upgrade instructions - I needed to throw in the database migration instructions from glitch-soc.github.io/docs/ before the server would actually come up.

            #snac2 boosted

            [?]philip » 🌐
            @philip@social.wittamore.fr

            My bleeding edge server has been updated
            snac 2.93-dev - A simple, minimalistic ActivityPub instance
            Copyright (c) 2022 - 2026 grunfink et al. / MIT license

              AodeRelay boosted

              [?]ololduck » 🌐
              @ololduck@vit.am

              Finally migrated to by @grunfink@comam.es !

              I've been on the fence about hosting my own account for a while now, mainly because of the insane hardware requirements.

              I've considered pleroma/akkoma for a while, but i don't know the first thing about elixir and their docs are not the most clear. I've been really tempted by , but 's easy install & general simplicity made me take the plunge.

                #snac2 boosted

                [?]lobster » 🌐
                @lobster@defcon.social

                Dear Friends of ,

                - Just installed 64bit Full on a thumb-drive. Seemed to work OK
                - Built for Grecian's it is designed for those trying to build an updated computer based on the Antikythera mechanism, analytical engine, abacus or similar. Mmm... now that is a citizen project worth pursuing and miniaturising
                - Meanwhile I am trying to download to run my own -drive /#server on it. Grateful for any help. Otherwise may have to use

                In other news, my one comment (the first one for their 25th years online edition) was removed by . Are they being what is the word... 'fascistic'?

                Power to the People. Not all the people just the selected... Eh wait. Think it might be time for the real revolution...

                  #snac2 boosted

                  [?]lobster » 🌐
                  @lobster@defcon.social

                  Hurrah as we say in ye olde England,

                  Managed to use EasyOS Excalibur-series version 7.3.7 and its 'Youtube downloader' to download an introduction by your favourite lobster (moi hopefully) to my very own Time Travel Linux. I created approx. five, with names like 'Tmxxine shard', 'Tmxxine prism' and this early one, 'Lucid Tmxxine', which is based on 'Puppy Lucid'. Tmxxine is pronounced Time Machine in case you are in the wrong time line.

                  I also tried to find out about a FreeBSD program called which allows one to set up a small personal server on a thumb drive BUT that is a story for another time...

                  Power to the People of all times and dimensions!

                  Alt...Yi ha! Back in ye olde days I created several Linux versions for time travellers. This is one called Lucid Tmxxine. Containing programs such as Corel Draw (open source version at the time) VLC, Skype (now confiscated by Microsoft), Openshot etc.

                    [?]IT Notes - https://it-notes.dragas.net » 🤖 🌐
                    @itnotes@snac.it-notes.dragas.net

                    FediMeteo, timezones, and the art of not breaking what already works

                    I have already written about how FediMeteo was born (https://it-notes.dragas.net/2025/02/26/fedimeteo-how-a-tiny-freebsd-vps-became-a-global-weather-service-for-thousands/), and about how HAProxy helps reduce the number of requests that reach snac (https://it-notes.dragas.net/2026/05/18/fedimeteo-haproxy-and-the-art-of-not-wasting-snac-threads/).

                    Seen from the outside, FediMeteo almost seems still. There is a static homepage, regenerated every hour. There are the city pages, with their forecasts. There are RSS feeds waiting to be fetched, JSON objects waiting to be requested, Fediverse instances refreshing data, subscribing, unsubscribing, retrieving profiles, and reading notes.

                    That is the visible part.

                    Behind it, however, FediMeteo (https://fedimeteo.com) is much more than a homepage, a few ActivityPub accounts, and a well-behaved reverse proxy. It is a chain of small pieces, in proper Unix style, each trying to do one thing and do it as well as possible.

                    That chain, although almost invisible from the outside, was not born already tidy. It changed, was rewritten, adapted to new countries, timezones, ambiguous city names, external service limits, and also to my own mistakes.

                    Some mistakes were small. Others were much less so.

                    Because FediMeteo is a human project and, as such, imperfect. Imperfect in the way humans are imperfect, which today almost seems unfashionable. I like that.

                    The first version of the bot was almost embarrassingly simple, and I was proud of that.

                    It took a city name as input, asked Nominatim (https://nominatim.org) for the coordinates through geopy, called the Open-Meteo (https://open-meteo.com) API for the current weather and the next several days, and printed a markdown block with current conditions, the forecast for today, the next twelve hours, and the coming days. The text was in Italian. The cities were Italian. The timezone was Europe/Rome. There was nothing to calculate.

                    Around the script, a small sh wrapper read a list of cities and, for each one, ran the Python program and piped its output into snac note_unlisted. A cron job ran the wrapper every six hours. The output was loose markdown, which snac happily renders, and the integration was: standard output goes into standard input. Nothing fancier than that.
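
The "standard output goes into standard input" integration can be illustrated in Python as well. The original wrapper is an sh script, so this is a sketch of the same idea and not the project's code: `post_city`, `report_cmd`, and `post_cmd` are hypothetical names, and in production `post_cmd` would be whatever snac note_unlisted invocation the wrapper actually uses.

```python
import subprocess
import sys
from typing import Sequence


def post_city(report_cmd: Sequence[str], post_cmd: Sequence[str]) -> bool:
    """Run report_cmd and feed whatever it prints to post_cmd's stdin."""
    report = subprocess.run(report_cmd, capture_output=True, text=True)
    if report.returncode != 0 or not report.stdout.strip():
        return False  # no report for this city; the loop simply moves on
    posted = subprocess.run(post_cmd, input=report.stdout, text=True)
    return posted.returncode == 0


if __name__ == "__main__":
    # Stand-ins so the sketch runs anywhere: a fake bot and a fake poster
    # that just echoes its standard input.
    fake_bot = [sys.executable, "-c", "print('**Ferrara**: sunny')"]
    fake_post = [sys.executable, "-c",
                 "import sys; sys.stdout.write(sys.stdin.read())"]
    for city in ("ferrara", "bologna"):
        print(city, "posted" if post_city(fake_bot, fake_post) else "skipped")
```

Note that everything the bot prints ends up in the post, which is exactly why this kind of pipe is simple and also why stray debug output is dangerous.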

                    I like this kind of design. It is the part of the Unix philosophy that survives even when fashions change.

                    When I started adding other European countries, I did not need to change much. I separated the operational logic from the localized strings, moved the strings into one JSON file per country, and spread the cron entries so that not every country posted in the same minute. Each country had its own snac instance, in its own FreeBSD jail, with its own dataset. The bot, internally, was almost the same script as before.

                    This worked because Europe is, in essence, two or three timezones across most of the countries I cared about.

                    Then I added Germany, and Germany taught me my first lesson about names.

                    There are several places called Neustadt in Germany. There is a Frankfurt am Main, and a Frankfurt an der Oder, and they are not the same city. There is a Halle in Saxony-Anhalt and a Halle in North Rhine-Westphalia. Asking Nominatim for "Frankfurt, Germany" produced one of the two, consistently, but not always the one I wanted. Some German users wrote to me, politely, to point out that the forecast for "their" Frankfurt was, in fact, for the other one.

                    I started thinking about disambiguation, but only enough to fix the immediate cases. The bot still took a single city name. The ambiguous ones I worked around by editing the cities file and hoping for the best.

                    In hindsight, this was the seed of what would happen later.

                    The United States broke every assumption the bot had grown up with.

                    The first problem was the number of cities. I wanted reasonable coverage at state level, which meant identifying the main cities for each of the fifty states. The list ended up at more than 1200 entries. That alone is more cities than every other country in the project combined.

                    The second problem was timezones. The contiguous United States covers four of them, and Alaska and Hawaii bring the total to six. A "current weather at 12:00" line generated at the same instant for New York and for Los Angeles is technically the same instant, but the two cities are living different parts of the day, and the forecast for "today" is not even quite the same window. A bot that pretended every city was on the same clock would be wrong, sometimes embarrassingly so, every single day.
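
A small Python sketch makes the "same instant, different day" problem concrete. It uses fixed standard-time offsets only to stay dependency-free; that is an assumption made for illustration, and a real bot would resolve proper IANA zone names so that daylight saving time is handled.

```python
from datetime import datetime, timedelta, timezone

# Fixed standard-time offsets, chosen only to keep the sketch self-contained.
NEW_YORK = timezone(timedelta(hours=-5), "EST")
LOS_ANGELES = timezone(timedelta(hours=-8), "PST")


def local_view(instant: datetime, tz: timezone) -> tuple[str, str]:
    """Return (local date, local time) of one instant in the given zone."""
    local = instant.astimezone(tz)
    return local.strftime("%Y-%m-%d"), local.strftime("%H:%M")


# One single instant: 06:00 UTC on 15 January 2026.
instant = datetime(2026, 1, 15, 6, 0, tzinfo=timezone.utc)

print("New York   :", *local_view(instant, NEW_YORK))
print("Los Angeles:", *local_view(instant, LOS_ANGELES))
```

At that instant New York is already on 15 January while Los Angeles is still on the evening of the 14th, so "today's forecast" covers a different window for each city.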

                    The third problem was the name thing again, only larger. There are dozens of Springfields. There is a Portland in Oregon and a Portland in Maine. The Germany workaround - editing the cities file by hand and hoping Nominatim picked the right city - was clearly not going to scale to a country where the same name is also a state.

                    I sat with this for a couple of days before admitting what I already knew.

                    The bot needed to be rewritten.

                    What made this hard was not the rewriting itself. It was the requirement to do it without breaking everything else.

                    By the time I decided to add the United States, the infrastructure around the bot had grown into something I trusted. Jails, snapshots, backup jobs, cron schedules, snac instances on production paths, the HAProxy layer, the homepage cron that aggregated follower counts, and a long list of cities being processed in series every six hours. None of that knew or cared about the bot's internal shape. All of it cared, very much, about the bot's external behavior: a city name and a country code go in, valid markdown comes out, and that markdown ends up in a timeline.

                    So the contract was clear, even if I had never written it down anywhere. The command-line interface, the output format, the exit codes, the way the wrapper script invoked it, the structure of the JSON country configs - all of it had to keep working. Italian had to keep working. German had to keep working. The cron job that ran every six hours had to keep producing the same shape of output, just with new countries added.

                    What I changed was almost everything below the surface.

                    The city argument grew an optional __state suffix, with a double underscore as separator:

                    python3 main.py springfield__illinois us
                    python3 main.py springfield__massachusetts us
                    python3 main.py new_york__new_york us
                    A city without the suffix continued to work exactly as before, which is what every European country needed. The country config gained a timezone field that could be a fixed string or the literal "auto"; when it was "auto", the bot used timezonefinder against the resolved coordinates to determine the right zone for that specific city. Internally I separated the weather provider behind an interface, so Open-Meteo could remain the primary while MET Norway and wttr.in sat behind as alternatives, with automatic fallback when the primary failed. Units became configurable per country: temperature, wind speed, precipitation. The United States needed Fahrenheit, miles per hour, and inches. Most of Europe wanted Celsius, kilometers per hour, and millimeters. The bot now does either, on a per-country basis, without caring which is which.
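
The optional suffix can be parsed in a way that leaves plain city names untouched. The following is a minimal sketch under assumptions: `CityArg`, `parse_city_arg`, and `geocoder_query` are hypothetical names, and the way the query string is built is a guess at how a state would help disambiguation, not the bot's actual logic.

```python
from typing import NamedTuple, Optional


class CityArg(NamedTuple):
    city: str
    state: Optional[str]


def parse_city_arg(raw: str) -> CityArg:
    """Split 'city__state' on the first double underscore.

    A plain 'city' yields state=None, so old invocations keep working.
    """
    city, sep, state = raw.partition("__")
    if not city or (sep and not state):
        raise ValueError(f"malformed city argument: {raw!r}")
    return CityArg(city, state or None)


def geocoder_query(arg: CityArg, country_name: str) -> str:
    """Build a free-text query; single underscores stand in for spaces."""
    parts = [arg.city, arg.state]
    words = [p.replace("_", " ") for p in parts if p]
    return ", ".join(words + [country_name])


print(parse_city_arg("springfield__illinois"))
print(geocoder_query(parse_city_arg("new_york__new_york"), "United States"))
print(geocoder_query(parse_city_arg("ferrara"), "Italy"))
```

Because a single underscore is still just part of the name, `new_york` on its own parses as a city with no state, exactly as it would have before the suffix existed.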

                    I am skipping a lot of small detail here, but the principle was always the same: every new degree of freedom had to be expressible as an optional field in the config or as an optional CLI flag. If a country did not set the new field, the old behavior continued, identical to before.
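
One way to express "if a country did not set the new field, the old behavior continued" is to merge each country config over a table of defaults. This is a sketch only: the key names and default values below are assumptions invented for illustration, not the project's real schema.

```python
import json

# Hypothetical defaults that mirror the original Europe-only behaviour.
DEFAULTS = {
    "timezone": "Europe/Rome",
    "temperature_unit": "celsius",
    "wind_speed_unit": "kmh",
    "precipitation_unit": "mm",
}


def load_country_config(text: str) -> dict:
    """Parse a country config and fill in every missing optional field."""
    user = json.loads(text)
    if not isinstance(user, dict):
        raise ValueError("country config must be a JSON object")
    return {**DEFAULTS, **user}  # explicit values win over defaults


italy = load_country_config('{"timezone": "Europe/Rome"}')
usa = load_country_config(
    '{"timezone": "auto", "temperature_unit": "fahrenheit",'
    ' "wind_speed_unit": "mph", "precipitation_unit": "inch"}'
)
print(italy["temperature_unit"], usa["temperature_unit"])
```

The useful property is that an old config file never has to be edited when a new option appears; only countries that need the new behaviour mention it.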

                    I tested this by running the new bot against the old country configs and comparing the output line by line. Where it differed, it was a bug in the new bot. Not in the test.

                    The first cycle after deploying the rewrite was, for every country except the United States, indistinguishable from the cycle before. That was the point.

                    This is the part of the story I dislike telling, which is precisely why I should tell it.

                    At some point during the development, while debugging an Open-Meteo response that did not look right, I added a print statement to the error path that dumped the full request URL whenever something went wrong. The full URL of the Open-Meteo customer endpoint includes the apikey query parameter. The print was meant for development. I forgot to remove it.

                    I deployed.

                    The next time Open-Meteo had an outage - and small ones happen, sometimes for several minutes at a time - the bot dutifully printed the failing request URL into the post body. For every city. For every cycle that ran during the outage. The wrapper script piped the output into snac note_unlisted without complaint. The posts went out, federated across the Fediverse, with my API key sitting in the text for anyone who cared to read.

                    Some users were kind enough to write me and tell me. Others were less kind, and made fun of me. Both groups were correct. This should not have happened.

                    I reported the incident to the Open-Meteo team, who were extremely understanding. They rotated the key immediately and gave me a fresh one. I removed the debug print, and then I did the slightly more useful thing, which was to add redaction at multiple layers - in the bot's output, in the daemon's logging, and in the debug helpers themselves. URL query parameters that look like API keys are masked. Environment variables and config keys named apikey or OPEN_METEO_APIKEY are redacted before any string reaches stdout or a log file. Even JSON-like fields that include open_meteo_apikey are scrubbed if they ever appear in something the program prints.

                    The lesson is not "be more careful." The lesson is that debug paths leak, sooner or later, so the secrets have to be unreachable from the debug paths in the first place. Now they are.
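
As an illustration of the masking idea, here is a small redaction helper that every output layer could call before writing a string anywhere. It is a sketch under assumptions, not the bot's implementation: the list of secret names, the regular expression, and the example URL are all made up for the example.

```python
import re

# Assumed list of names that mark a value as secret (matched case-insensitively).
SECRET_NAMES = ("open_meteo_apikey", "api_key", "apikey")
_NAMES = "|".join(re.escape(name) for name in SECRET_NAMES)

# Covers name=value (URLs, environment dumps) and "name": "value" (JSON-like text).
_SECRET_RE = re.compile(
    r"(?P<prefix>[\"']?\b(?:" + _NAMES + r")\b[\"']?\s*[=:]\s*[\"']?)"
    r"(?P<value>[^\s&\"',}]+)",
    re.IGNORECASE,
)


def redact(text: str) -> str:
    """Replace the value of anything that looks like an API key with ***."""
    return _SECRET_RE.sub(lambda m: m.group("prefix") + "***", text)


url = "https://api.example.invalid/v1/forecast?latitude=44.84&apikey=SECRET123&hourly=temperature_2m"
print(redact(url))
print(redact("OPEN_METEO_APIKEY=abc123"))
print(redact('{"open_meteo_apikey": "abc123", "city": "ferrara"}'))
```

A pattern like this is a last line of defence. The stronger guarantee described above comes from keeping the key out of any string that a debug path can reach at all.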

                    That afternoon, when I realised what was happening, I closed everything for a minute and looked out of the window. Then I started fixing.

                    Nominatim is a public service, and it is generous, but it is not infinite. Every city in the project needs coordinates, and at the start of the project every cycle would re-ask Nominatim for every city. Most of the time this worked. Sometimes it did not.

                    There was one cycle, before I added caching, when Nominatim simply did not respond for one of my queries. The geopy call timed out. The bot raised an exception. The wrapper script gave up on that city and moved on to the next one. A few users noticed that a particular city had not received its forecast that day, and asked what had happened.

                    I added a coordinate cache, and I am still grateful that I did.

                    The cache is intentionally boring. The first time the bot resolves a city, it writes the latitude and longitude into a small file under /tmp, named after the city, and the state when present. Every subsequent run reads the file. If the file exists, no Nominatim call is made. If the file is missing, the bot calls Nominatim and writes the file. After the first successful lookup, the cache becomes the source of truth for the coordinates of that city.
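
The read-through behaviour described above fits in a short function. This is a minimal sketch with assumptions stated up front: the JSON file format, the `.coords.json` suffix, and the `geocode` callback are hypothetical, since the text only says the cache is a small file per city under /tmp.

```python
import json
from pathlib import Path
from typing import Callable, Optional, Tuple

Coords = Tuple[float, float]


def cache_path(cache_dir: Path, city: str, state: Optional[str] = None) -> Path:
    """File name derived from the city, plus the state when present."""
    name = f"{city}__{state}" if state else city
    return cache_dir / f"{name}.coords.json"


def resolve(
    city: str,
    state: Optional[str],
    geocode: Callable[[str, Optional[str]], Coords],
    cache_dir: Path = Path("/tmp"),
) -> Coords:
    """Return cached coordinates, calling the geocoder only on a cache miss."""
    path = cache_path(cache_dir, city, state)
    if path.exists():
        data = json.loads(path.read_text(encoding="utf-8"))
        return float(data["lat"]), float(data["lon"])
    lat, lon = geocode(city, state)  # may raise; nothing is cached in that case
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps({"lat": lat, "lon": lon}), encoding="utf-8")
    tmp.replace(path)  # rename into place so readers never see a partial file
    return lat, lon
```

Because the cached file is plain text, correcting a badly geocoded city is a matter of editing two numbers, and the next call to `resolve` picks them up without touching the geocoder.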

                    This is lighter on Nominatim, faster for every cycle, and much more resilient against transient failures. It is also nice for a reason I did not anticipate.

                    Nominatim is a geocoder, and like every geocoder it has opinions.

                    I live in Ferrara, so when I added Italy I made sure Ferrara was in the list, and I checked the first cycle to make sure everything looked right. The forecast came out fine. The temperature was reasonable. The icon matched the sky outside my window. I closed the laptop and forgot about it.

                    Then, one evening months later, I looked more carefully at the coordinates Nominatim had returned for "Ferrara, Italy", and I realised they did not point to the city. They pointed to a location closer to the centroid of the province, which is a much larger area and mostly countryside. The forecast had been, on average, for a field somewhere outside town, not for the city center.

                    I am not entirely sure why I had not noticed earlier. Probably because the weather in Ferrara and the weather in the fields outside Ferrara is, on most days, indistinguishable to anyone who is not paying attention. But this is the kind of detail I do not want to leave wrong, especially for my own city.

                    There are other places where geocoding lands slightly off. Sometimes it is a few kilometers, sometimes a different neighborhood, sometimes genuinely the wrong place.

                    Because the cache is just a file per city, the fix is also just a file per city. I open the cache file, replace the latitude and longitude with the correct values, save. The next cycle uses the corrected coordinates. No code change, no redeploy, no special tooling. I keep a small list of patched cities in a separate text file, so that if I ever rebuild the cache, I do not lose the manual corrections.

                    This is the kind of operational simplicity I like. A cache made of plain files costs almost nothing and quietly pays back every time a small problem appears.

                    For every report it generates, the bot also writes a simplified English text snapshot to /tmp/.txt, or /tmp/__.txt when there is a state.

                    This is intentional, and it is not a debug artifact. I am not ready to say what I am doing with it yet, but it is part of a future direction for the project. Text is a useful intermediate format, and having a clean, language-neutral representation of every forecast sitting on disk costs almost nothing and might be worth a great deal later.

                    I prefer to let ideas mature in private before I commit to them in public. So I will leave it at this for the moment.

                    A full cycle for the United States takes hours.

                    It is not because the work is heavy. It is because I deliberately inserted a small sleep between cities, to give snac time to dispatch the previous post before the next one is generated. With more than 1200 cities in series, even a short pause adds up. I am not in a hurry. Forecasts that arrive a few minutes apart from each other are not a problem, and the bot was already a polite citizen elsewhere. A polite cycle is fine.
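
To see how quickly a short pause adds up, here is the arithmetic as a tiny Python sketch. The pause lengths are hypothetical, since the actual sleep value is not stated; only the figure of roughly 1200 cities comes from the text.

```python
from datetime import timedelta


def cycle_duration(cities: int, pause_s: float, work_s: float = 0.0) -> timedelta:
    """Total time for a serial cycle: (pause + per-city work) for every city."""
    return timedelta(seconds=cities * (pause_s + work_s))


for pause in (5, 10, 15):
    print(f"{pause:>2}s pause -> {cycle_duration(1200, pause)}")
```

Even with zero time spent on the forecast itself, a pause of a few seconds per city is enough to stretch a 1200-city cycle into hours.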

                    The problem with a slow cycle is not the duration. The problem is what happens to it.

                    In the original design, the cycle was launched by cron. Every six hours, cron called the wrapper script, the wrapper iterated through the cities file, and for each city it ran the bot and piped the output into snac. There was no scheduler in the project at all. Cron was the scheduler. The wrapper was just a loop.

                    Restarting snac was harmless. The wrapper would call snac note_unlisted per city, and if snac happened to be unavailable for a moment, that single call might fail, but the loop kept moving and snac was usually back within seconds. Snac itself was not what held the cycle together.

                    What held the cycle together was the wrapper process. And the wrapper process lived inside the jail.

                    If the FreeBSD jail was restarted while the wrapper was running, the loop stopped wherever it happened to be. The cron schedule did not care. Six hours later, the next cron tick started a new cycle from the first city, and the cities that had been about to be processed at the moment of the restart were simply skipped for that window. For the United States, this could mean several hundred cities going without an update.

                    There was a worse case, and it took me longer than it should have to recognise it. If the host was rebooting exactly in the minute when cron should have fired, cron simply did not fire. There was no daemon waiting to pick up the missed tick. The cycle never even started. Six hours of forecasts would be lost, in silence, with nothing in any log to suggest anything had gone wrong.

                    I lived with this for a long time. Reboots were rare, the impact was limited, and adding state was the kind of thing I always meant to do "next week."

                    What finally changed it was not a dramatic incident. It was the slow accumulation of small ones. A scheduled VPS reboot. A jail restart after an upgrade. Each one on its own was nothing. Together, they were a steady drip of missed cycles.

                    So I wrote a daemon.

                    The crontab entries for the bot went away. There is now a long-running process inside the jail, started at boot, and it does the scheduling itself. The schedule is a list of hours and a minute, read from a JSON config. The daemon wakes up once a minute, checks whether it is time to start a cycle, and either starts one or waits.
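
The once-a-minute check can be sketched as a pure function, which also makes it easy to test. The names and the idea of remembering the last started slot are assumptions for illustration; the text only says the schedule is a list of hours and a minute read from a JSON config.

```python
from datetime import datetime
from typing import Iterable, Optional


def cycle_slot(now: datetime, hours: Iterable[int], minute: int) -> Optional[str]:
    """Return an identifier for the slot that is due at 'now', else None."""
    if now.hour in set(hours) and now.minute == minute:
        return now.strftime("%Y-%m-%dT%H:%M")
    return None


def should_start(
    now: datetime, hours: Iterable[int], minute: int, last_started: Optional[str]
) -> bool:
    """True when a slot is due and it has not already been started.

    Remembering the last slot guards against starting the same cycle twice
    if the loop happens to wake up twice within one minute.
    """
    slot = cycle_slot(now, hours, minute)
    return slot is not None and slot != last_started


schedule_hours = [0, 6, 12, 18]  # hypothetical schedule
print(should_start(datetime(2026, 5, 25, 6, 5), schedule_hours, 5, None))
print(should_start(datetime(2026, 5, 25, 7, 5), schedule_hours, 5, None))
```

The daemon loop around it would then be little more than: compute `should_start` for the current time, run a cycle if it is true, and sleep until the next minute.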

                    The interesting part is the state file.

                    As the daemon walks through the cities file, it writes its position to a small JSON file: which cities file it is processing, and the index of the next city to handle. The write happens at the boundary between one city and the next, because that is the only place where resuming makes sense. If the daemon is interrupted mid-city, that city is retried on resume; no half-finished post escapes.

                    When the daemon starts, it reads the state file. If it finds one matching the current cities file, it resumes from the saved index. If the cities file has changed since the state was written, the daemon starts fresh. The check is deliberately conservative: a renamed or modified cities file is treated as a different cycle, because the indices would otherwise be meaningless.
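
The checkpoint-and-resume logic can be sketched as follows. It is an illustration under assumptions, not the daemon's code: the state keys, the use of a content hash to detect a changed cities file, and the function names are all hypothetical.

```python
import hashlib
import json
import os
from pathlib import Path
from typing import Callable


def fingerprint(cities_file: Path) -> str:
    """Identify a cities file by path and content, so any change resets the cycle."""
    digest = hashlib.sha256(cities_file.read_bytes()).hexdigest()
    return f"{cities_file.resolve()}:{digest}"


def save_state(state_file: Path, cities_file: Path, next_index: int) -> None:
    payload = {"cities": fingerprint(cities_file), "next_index": next_index}
    tmp = state_file.with_name(state_file.name + ".tmp")
    tmp.write_text(json.dumps(payload), encoding="utf-8")
    os.replace(tmp, state_file)  # atomic rename: never a half-written state file


def load_start_index(state_file: Path, cities_file: Path) -> int:
    """Return the saved index, or 0 when the state is missing or does not match."""
    try:
        state = json.loads(state_file.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return 0
    if not isinstance(state, dict) or state.get("cities") != fingerprint(cities_file):
        return 0
    index = state.get("next_index")
    return index if isinstance(index, int) and index >= 0 else 0


def run_cycle(cities_file: Path, state_file: Path,
              handle_city: Callable[[str], None]) -> None:
    cities = cities_file.read_text(encoding="utf-8").split()
    for i in range(load_start_index(state_file, cities_file), len(cities)):
        handle_city(cities[i])
        # Checkpoint only at the boundary between cities: an interruption
        # inside handle_city means that city is retried on resume.
        save_state(state_file, cities_file, i + 1)
    state_file.unlink(missing_ok=True)  # cycle complete: reset the state
```

Writing the state through a temporary file and a rename matters here, because the whole point of the file is to survive the moment when the process is killed.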

                    The result is the behavior I should have had from the start. If the host reboots while the United States cycle is running, the daemon comes back up with the jail, reads the state, and continues from where it left off. Every city still gets its update, just with a small gap corresponding to the reboot itself. The cycle finishes. The state file is reset. Life goes on.

                    And the worst case from the cron days is gone. The daemon does not need anyone to fire it. As long as the jail is running, the daemon is running, and the next scheduled cycle will happen when its hour comes, regardless of what was happening at any specific minute.

                    Of all the changes I have made to the project, this is the one I like most. It is not exciting work. It is the kind of thing that earns no applause because, when it works, it produces no visible event. But it removes a whole class of small daily annoyances, and it makes a slow process robust against the boring kind of failure: the kind nobody plans for, but that always eventually happens.

                    The current bot does considerably more than the original Italian script. It handles per-city timezones, three weather providers with automatic fallback, unit conversion for temperature, wind, and precipitation, optional air quality, pressure trend indicators when the provider supplies pressure data, a simplified English text snapshot for future use, a coordinate cache that can be patched by hand, secret redaction at multiple layers, a heartbeat that adapts to whichever HTTP client is installed on the host, and a scheduler-and-resume daemon that survives reboots.

                    But from the outside, almost nothing has changed.

                    The European country configs work the same way they always did. The wrapper scripts are unchanged. The snac integration is the same one-line pipe. The HAProxy layer in front does not know or care that the bot was rewritten. The homepage cron that counts followers and regenerates the static page works exactly as before.

                    The original Italian script does not exist as a file anymore, but it survives as a default. A country config with timezone set to Europe/Rome and no special options behaves, today, exactly as the first version of the bot would have. Everything else is opt-in.

                    I like this kind of work.

                    https://it-notes.dragas.net/2026/05/25/fedimeteo-timezones-and-the-art-of-not-breaking-what-already-works/