A little while ago, I warned of insecure architecture risks in BluEsky that facilitate surveillance. Since then (as some have commented to me privately) a ballooning number of “artists” have been visualizing what they can see with a federated protocol that offers “efficiency” for surveillance.
Here is how the Bluesky docs describe it: One of the core primitives of the AT Protocol that underlies Bluesky is the firehose. It is an authenticated stream of events used to efficiently sync user updates (posts, likes, follows, handle changes, etc.).
Many applications people will want to build on top of atproto and Bluesky will start with the firehose, from feed generators to labelers, to bots and search engines.
In the atproto ecosystem, there are many different endpoints that serve firehose APIs. Each PDS serves a stream of all of the activity on the repos it is responsible for. From there, relays aggregate the streams of any PDS who requests it into a single unified stream.
This makes the job of downstream consumers much easier, as you can get all the data from a single location. The main relay for Bluesky is bsky.network, which we use in the examples below.
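To make the aggregation step concrete, here is a toy model in Go of what "aggregate the streams into a single unified stream" means structurally. It is only a sketch under stated assumptions, not how any real relay is implemented: `Event`, `hostStream`, `mergeStreams`, the host names and the `did:example:` identifiers are all hypothetical, and in-memory channels stand in for network connections.

```go
package main

import (
	"fmt"
	"sync"
)

// Event is a hypothetical stand-in for a firehose event: which host
// emitted it and which repo (account) it concerns.
type Event struct {
	Host string
	Repo string
}

// hostStream simulates one host emitting an event per repo it serves,
// then closing its stream.
func hostStream(host string, repos ...string) <-chan Event {
	ch := make(chan Event)
	go func() {
		defer close(ch)
		for _, r := range repos {
			ch <- Event{Host: host, Repo: r}
		}
	}()
	return ch
}

// mergeStreams fans any number of per-host streams into one channel and
// closes that channel once every input stream has been drained.
func mergeStreams(streams ...<-chan Event) <-chan Event {
	out := make(chan Event)
	var wg sync.WaitGroup
	for _, s := range streams {
		wg.Add(1)
		go func(in <-chan Event) {
			defer wg.Done()
			for evt := range in {
				out <- evt
			}
		}(s)
	}
	go func() {
		wg.Wait()
		close(out)
	}()
	return out
}

func main() {
	unified := mergeStreams(
		hostStream("pds-a.example", "did:example:alice", "did:example:bob"),
		hostStream("pds-b.example", "did:example:carol"),
	)
	// A single consumer loop now sees activity from every host.
	for evt := range unified {
		fmt.Printf("%s via %s\n", evt.Repo, evt.Host)
	}
}
```

The order of the printed lines varies between runs. The point is the shape: once the fan-in exists, whoever reads the merged channel needs exactly one loop to see everyone.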
Their example code has given birth to a number of “artistic” endeavors. Here are but a few.
EmoJirain (I know, it’s supposed to say emoji, but who doesn’t see this as emo?)
RainBowsky (I know, it’s supposed to say rainbow, but the Russian in me sees bowsky):
InTothEbluEsky:
FirEhose3D:
NightSky:
Need I go on?
FinalWords prints all the text being deleted so there’s a record of things people want to make disappear, 3D Connections is a graph of everyone’s associations, Emotions is a live display of sentiment online…
Whee! Surveillance features can be repackaged as creative tools.
These “artistic” visualizations aren’t just pretty pictures; they are live demonstrations of mass surveillance capabilities:
- EmoJirain and BluEskyEmo show real-time monitoring and classification of user emotional expression
- RainBowsky and InTothEbluEsky prove continuous scanning and pattern matching of all user content
- FirEhose3D and NightSky demonstrate real-time tracking of user activity and interaction patterns
- 3D Connections maps personal relationships and social networks across the entire platform
- FinalWords archives deleted content that users specifically wanted removed
- Emotions conducts mass-scale sentiment analysis of the entire user base
Each tool leverages the same centralized firehose of user data, just with a different veneer painted over surveillance capabilities.
While today we see emoji rain, tomorrow the same firehose could be used for… behavior pattern analysis and user profiling, network mapping of user relationships and communities, content monitoring for any topic of interest, real-time tracking of information spread, mass collection of user metadata (post times, devices, engagement patterns)… oh, hold on, that’s already happening.
These artistic expressions process the entire firehose of user activity, from who knows where physically, with a “friendlier” output than that of the operators of the infamous Room 641A in San Francisco.
Thus the firehose feature fundamentally creates a broad attack surface by design and we are seeing it deployed. Bluesky, or is it BlueSky, …FireHose or FirEhose? Either way we’re literally talking about intentional access to all user activities. The architectural choice to create a centralized “firehose” of all user activity fundamentally undermines claims of decentralization.
Who ordered the complete visibility into centralized user behavior at scale?
Well, as they say in the docs, “relays aggregate the streams…into a single unified stream” because why?
    rsc := &events.RepoStreamCallbacks{
        RepoCommit: func(evt *atproto.SyncSubscribeRepos_Commit) error {
            fmt.Println("Event from ", evt.Repo)
            for _, op := range evt.Ops {
                fmt.Printf(" - %s record %s\n", op.Action, op.Path)
            }
            return nil
        },
    }
I’ll say it again.
Why?
The simplicity of the BluEsky example code isn’t just poor documentation of the risks; it clearly reflects an architectural decision to favor “efficiency” over privacy protection.
Look mom, just three lines of code is all it takes for you to tap into every user action across the platform!
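For a rough sense of how short the distance is from printing events to classifying them, here is a stdlib-only Go sketch that tallies operations by action and record collection. It does not connect to anything. `Op` is a hypothetical local stand-in for the two fields the docs callback prints (`op.Action` and `op.Path`), `collectionOf` and `tally` are names I made up, the record keys are placeholders, and the sketch assumes paths follow the `<collection>/<record key>` layout.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Op is a hypothetical stand-in for the two fields the callback reads
// from each operation in a commit: the action and the record path.
type Op struct {
	Action string
	Path   string
}

// collectionOf returns the part of a record path before the first "/",
// assuming the "<collection>/<record key>" layout. A path without a
// slash is returned unchanged.
func collectionOf(path string) string {
	collection, _, _ := strings.Cut(path, "/")
	return collection
}

// tally counts operations per "action collection" pair.
func tally(ops []Op) map[string]int {
	counts := make(map[string]int)
	for _, op := range ops {
		counts[op.Action+" "+collectionOf(op.Path)]++
	}
	return counts
}

func main() {
	// Placeholder operations; the record keys are invented.
	ops := []Op{
		{"create", "app.bsky.feed.post/aaa"},
		{"create", "app.bsky.feed.like/bbb"},
		{"delete", "app.bsky.feed.post/ccc"},
		{"create", "app.bsky.feed.post/ddd"},
	}
	counts := tally(ops)

	// Sort the keys so the report is stable between runs.
	keys := make([]string, 0, len(counts))
	for k := range counts {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Printf("%-28s %d\n", k, counts[k])
	}
}
```

Key the same map by repo instead of by collection and you have a per-account activity counter, which is rather the point.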
While the example code shows how to technically connect to a centralized stream, more importantly it raises obvious and critical security questions that everyone should weigh. I’m not exposing vulnerabilities in code (that would probably make everything worse right now) but rather talking here about a management decision to push “efficiency” into an architecture that begs for surveillance and abuse.
- Volume of data
- Storage and processing of user activity data
- Authentication and rate limits
- Abuse of streams
The fact that “art” is the motive, and not yet targeted assassinations or mass deportations, doesn’t make BlueSky publishing code and docs for surveillance any less concerning.
This wouldn’t be the first time surveillance was dressed up in artistic clothing without explanation. In fact, the parallels to history are striking.
Recently I spoke with survivors of the East German Stasi infiltration of artistic communities (1970s-1980s). The state police saw cultural spaces such as galleries as opportunities for surveillance, especially related to cafes like Potsdam’s HEIDER.
The “avant-garde” artists actually worked as informants. This was arguably an extension of the Soviet Composers’ Union that monitored artistic expression.
Ok historians, let’s be honest here, this problem hits much closer to home than Americans like to admit. President Jackson and President Wilson were horrible abusers of surveillance, infamously using state apparatus to intercept and inspect all postal mail and all telephone calls. But we’re really talking about modern precedents like the GCHQ and NSA operation Optic Nerve 2008-2010 on Yahoo (years after I quit, please note) that sucked up a firehose of webcam images in a state-sponsored “art project”. And then the Google Arts & Culture face-matching app (2018) collected massive amounts of biometric data under the guise of matching people to classical paintings…
Wait a minute!
Optic Nerve (2008-2010) predated the ImageNet competition (2009-2017), based on unethical privacy violations by a Stanford team, that sparked the “big data” revolution we’re now swimming in.
Are we seeing history rhyme again with BlueSky’s “artistic” firehose? Surveillance keeps reinventing itself while using the same playbook.
Something smells rotten in BluEsky, and no amount of that EmoJirain is going to mask it for those who remember past abuses.