An overview of Matrix, who it targets and privacy considerations
Jul 23, 2024
I've blogged about Matrix and my relationship with it in the past. In short, I ran my own XMPP server from 2016, but in 2018 I gave up on XMPP and moved to a self-hosted Matrix server. Ever since, for better or worse, Matrix has been my one and only means of communication with people (besides email). At the beginning of 2020 I joined Element, at the time the main company behind Matrix. I left Element towards the end of 2022.
I believe the above makes it clear that I really want to see Matrix succeed. However, there are various use cases that Matrix currently isn't suited for, and I want to explain how some things in the Matrix ecosystem currently work.
Note: I wrote this post at the beginning of 2024, but I ended up not publishing it because it felt like a rant. I'm revisiting this now (late July 2024) because even though it is a rant, it's a rant that answers questions friends ask me.
Matrix the protocol and the Matrix.org Foundation
I think we first have to explain what Matrix currently is, who "controls" it and what's planned in future versions. According to the about page of matrix.org, "Matrix is an open protocol for decentralised, secure communications". Without going into too much detail, on the technical side, Matrix is a collection of HTTP JSON APIs that enable the creation of a federated and decentralized network for communications. I want to point out here that this doesn't necessarily mean communications among humans. Matrix is a protocol to exchange messages in general: between humans, devices, systems, etc.
The specification of these APIs is what's called the Matrix Spec, usually referred to simply as the spec. The spec is owned by the Matrix.org Foundation, a non-profit UK CIC which is meant to be the steward of various Matrix bits and pieces and of the Matrix community. The history of the Foundation and its relation to Element can be somewhat gleaned from past blog posts on Matrix.org and various comments in online forums, but it's also complicated. There's a GitHub issue to properly document this. A couple of months ago, this GitHub issue, which I wasn't aware of, ended up on the front page of Hacker News, showcasing some of the complex relationship between Element and the Foundation.
In September 2023, the Foundation announced a new Managing Director; my understanding is that no such role existed in the Foundation before this. I view this as a very welcome and important change, as from what I've seen so far Josh seems to be really on board with making Matrix a truly open and community-driven protocol. There's still lots of work to be done, but I applaud all the efforts taken by everyone involved so far. The Matrix room for the Foundation, while sometimes going wildly off-topic, is in my opinion one of the few rooms that's worth keeping an eye on, as it's the place where things involving the community are discussed.
The Matrix Spec
The spec is licensed under the Apache 2 license and anyone is welcome to contribute to it. Writing a Matrix Spec Proposal (MSC) to propose a change in the spec can be either an easy or a hard process, depending on what is proposed. For core changes, a working software implementation must also exist to showcase that the change makes sense in the real world. As such, there exist some MSCs that have been implemented for years but are stuck in a review loop/state, yet they're the de facto way of doing things currently. Sorting the currently open proposals by number of comments, we can see that there are important proposals and features with multiple hundreds of comments that have been open for three to five years or more.
Reviewing and accepting MSCs is the responsibility of the Spec Core Team (SCT), a team of people assigned by the Foundation that makes the final decisions on the spec. At the time of writing this, seven out of the ten people in the SCT seem to be employed by Element according to the SCT page. This comment on HN explains how this came to be (tl;dr community members turned Element employees), but the point remains. Up until mid February 2024 the SCT was eight people, with seven of them working for Element.
I have never attempted to write an MSC, so I can't speak of the process personally, but I've read enough discussion threads in proposals to know I don't have the mental capacity to follow the process through. This recent MSC proposed some changes to the process, trying to reduce the dependency on the SCT and make things easier for everyone.
Target audience of Matrix
Now that we have some context, let's circle back to what Matrix is and who its target audience is. Specifying a target audience for Matrix is very hard; to me it seems to be everyone and everything. Because the protocol lends itself nicely to being a generic messaging protocol for federated services, it's become a platform to build things on top of. Third Room is one such example, Element Call is another.
In a presentation at FOSDEM 2024, a slide from Element's 2019 investor pitch was shown. The slide reads "5 years from now, everyone will communicate via Matrix. Matrix will replace the phone network and email." While certainly an audacious claim, it makes it clear what Element wants to achieve with Matrix: a protocol to encapsulate all communications. The original slide mentioned ten years; five years have passed since, and the slide was changed to match this.
I don't think that's a bad thing and I believe that we need such a protocol to exist. I also see why investors would like that and it makes a great sales pitch. However, this explains why things are the way they are in Matrix currently and why I earlier said that the target audience for Matrix is everyone and everything.
This attempt to become such a generic protocol has led to Matrix having some important privacy and real-world problems that, while known in the community, haven't been addressed for years.
Metadata
The importance of metadata has been known for years, at the very least since the Snowden leaks. I remember reading EFF's "Why metadata matters" page after seeing this slide live in 30C3. An ex-NSA director has famously said "We kill people based on metadata".
So, it's my belief that any protocols or apps that want to be considered secure and privacy conscious should also care about metadata. Unfortunately, Matrix doesn't have the best track record in this regard. If you search for Matrix and metadata online you'll find various forums and blog posts being very vocal about how Matrix is terribly bad when it comes to metadata. Some of these claims are incorrect, while others do have some merit to them. It's hard to state what is important and what isn't without first formalizing a proper threat model. However, I'll try to list the problems I'm aware of, and everyone should judge for themselves whether they're important or not based on their own threat model.
Non-encrypted information
While Matrix has support for end to end encryption (e2ee) using the Double Ratchet Algorithm, some of the data in a Matrix room remain unencrypted. These include:
- Reactions to messages
- Redactions of messages. When deleting a message, a user can also include a reason why this message was deleted. This reason field isn't encrypted.
- The avatar/icon of the room
- The topic of a room
- Room participants, their avatars and nicks, their power levels (if they're admins or not), timestamps of their actions.
All of the above can be seen by the administrators of all participating servers in a room.
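To make the list above a bit more concrete, here is a rough Python sketch of what a server admin could read from a handful of events in an encrypted room. The event types are the ones from the spec (`m.room.encrypted`, `m.reaction`, `m.room.redaction`, `m.room.topic`), but the event shapes are heavily simplified, and the helper `visible_to_server_admin` as well as all IDs and values are made up for illustration.

```python
# Simplified event shapes; real events carry more fields (event IDs, hashes,
# signatures, ...). All users, IDs and values below are made-up sample data.
EVENTS = [
    {   # An encrypted message: the body is hidden, sender and timestamp are not.
        "type": "m.room.encrypted",
        "sender": "@alice:example.org",
        "origin_server_ts": 1700000000000,
        "content": {"algorithm": "m.megolm.v1.aes-sha2",
                    "ciphertext": "<opaque base64>"},
    },
    {   # A reaction: sent in the clear, including the emoji and its target.
        "type": "m.reaction",
        "sender": "@bob:example.com",
        "origin_server_ts": 1700000005000,
        "content": {"m.relates_to": {"rel_type": "m.annotation",
                                     "event_id": "$msg1", "key": "👍"}},
    },
    {   # A redaction: the optional reason is plain text.
        "type": "m.room.redaction",
        "sender": "@alice:example.org",
        "origin_server_ts": 1700000010000,
        "content": {"redacts": "$msg1", "reason": "posted in the wrong room"},
    },
    {   # Room state such as the topic is never encrypted.
        "type": "m.room.topic",
        "sender": "@alice:example.org",
        "origin_server_ts": 1700000015000,
        "content": {"topic": "Planning the surprise party"},
    },
]


def visible_to_server_admin(event):
    """Return the parts of an event that are readable without any e2ee keys."""
    visible = {
        "type": event["type"],
        "sender": event["sender"],
        "origin_server_ts": event["origin_server_ts"],
    }
    if event["type"] == "m.room.encrypted":
        # Only the ciphertext is opaque; everything around it is readable.
        visible["content"] = {key: value
                              for key, value in event["content"].items()
                              if key != "ciphertext"}
    else:
        # Reactions, redactions and state events are stored as-is.
        visible["content"] = event["content"]
    return visible


if __name__ == "__main__":
    for event in EVENTS:
        print(visible_to_server_admin(event))
```

Even for the encrypted message, who sent it and when stays readable; for the other three events nothing is hidden at all.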
Information leaks
While I don't think this is something that should be done on a protocol level, it's been common for the past few years for services to strip EXIF metadata from user-uploaded images. The flagship client (Element) currently doesn't do this. The relevant issues and PRs have existed for years, but they never actually got merged due to various issues.
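For a sense of what stripping EXIF involves on the client side, below is a minimal standard-library Python sketch that drops Exif segments from a JPEG before upload. It's not how Element or any other client does (or would do) it; the function name `strip_exif` is my own, and it only handles the common JPEG layout where metadata segments come before the image data. A real client would more likely rely on a maintained image library.

```python
import struct

SOI = b"\xff\xd8"               # start-of-image marker
APP1 = 0xE1                     # segment type used by Exif
SOS = 0xDA                      # start of scan: compressed image data follows
EXIF_HEADER = b"Exif\x00\x00"   # identifies an APP1 segment as Exif


def strip_exif(jpeg: bytes) -> bytes:
    """Return a copy of a JPEG with its Exif APP1 segments removed."""
    if not jpeg.startswith(SOI):
        raise ValueError("not a JPEG file")
    out = bytearray(SOI)
    pos = 2
    while pos + 4 <= len(jpeg):
        if jpeg[pos] != 0xFF:
            raise ValueError("malformed JPEG: expected a segment marker")
        marker = jpeg[pos + 1]
        if marker == SOS:
            # Everything from here on is image data; copy it untouched.
            out += jpeg[pos:]
            return bytes(out)
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        end = pos + 2 + length
        payload = jpeg[pos + 4:end]
        is_exif = marker == APP1 and payload.startswith(EXIF_HEADER)
        if not is_exif:
            out += jpeg[pos:end]
        pos = end
    # No scan segment found (truncated file): keep whatever is left.
    out += jpeg[pos:]
    return bytes(out)
```

Calling `strip_exif(open("photo.jpg", "rb").read())` (with a hypothetical `photo.jpg`) returns the bytes to upload instead of the original file. Other metadata containers, such as XMP, are left alone by this sketch.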
By default, the display name and avatar of a user are publicly accessible if federation is enabled, even if the user is only a member of private/invite-only rooms. As such, a workaround some people use is to have a blank image as avatar and an empty displayname and change these on a per-room basis.
Uploaded files in a Matrix room are publicly accessible, both in unencrypted rooms and in encrypted rooms. Theoretically, it's not that much of a problem since one needs to know the URL of the file to access it. However, it can be used as a very effective abuse avenue similar to what is mentioned here. I won't go into detail, but this can be abused in much worse ways than that. This is somewhat addressed via MSC3916, which was included in Matrix Spec v1.11, released on June 20th, 2024.
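To illustrate the "one needs to know the URL" part: media is referenced by `mxc://` URIs, which map onto plain HTTP download endpoints. The sketch below shows that mapping for the older unauthenticated media endpoint and the authenticated one that came with MSC3916, as I understand them from the spec. The helper name `media_download_urls` and the example servers and media ID are made up.

```python
def media_download_urls(mxc_uri: str, homeserver: str) -> dict:
    """Map an mxc:// URI to download URLs on the given homeserver.

    `homeserver` is the base URL of any homeserver willing to serve the
    file, e.g. "https://matrix.example.com" (a made-up example).
    """
    prefix = "mxc://"
    if not mxc_uri.startswith(prefix):
        raise ValueError("not an mxc:// URI")
    # An mxc URI has the form mxc://<server name>/<media id>.
    server_name, _, media_id = mxc_uri[len(prefix):].partition("/")
    if not server_name or not media_id:
        raise ValueError("mxc URI needs a server name and a media ID")
    return {
        # Older endpoint: no access token needed, only the URL.
        "unauthenticated": (
            f"{homeserver}/_matrix/media/v3/download/{server_name}/{media_id}"
        ),
        # Endpoint added with MSC3916: requires an access token.
        "authenticated": (
            f"{homeserver}/_matrix/client/v1/media/download/{server_name}/{media_id}"
        ),
    }


if __name__ == "__main__":
    urls = media_download_urls("mxc://example.org/AbCdEf123",
                               "https://matrix.example.com")
    for kind, url in urls.items():
        print(kind, url)
```

With the older endpoint, anyone holding the first URL can fetch the file without being in the room or even having a Matrix account, which is the behaviour described above.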
In a federation everyone has your data forever
Due to the federated nature of Matrix, the data in a (federated) room are replicated to all participating servers. This includes both the e2ee data and the unencrypted data mentioned previously. This is done by design and is generally how federated systems work. However, it should be noted that it means that your data and metadata will be around for as long as the last remaining server with your data is still around.
In Matrix, it's possible to request all the data of a federated room from any server that's participating in the room (assuming you have the permissions to do so). This is useful when you want to backfill information about a room that you joined long after it was created. It's also useful when your server dies and you don't have backups of it, as happened with mine.
As I explained previously, it's possible to use this feature to recover information for rooms in domains that have expired. Again, this is mentioned in the docs, but it's not something that will be fixed any time soon.
In theory, self-destructing/disappearing messages would help in this area if homeservers follow the spec, but that too is only a band-aid for this problem, and even though it's a much-requested feature, it's not implemented yet.
FOSS that's not run by a community
So, who should use Matrix and can we fix those issues?
I'll start by saying that I don't know if all of these issues can be fixed. Some of them can certainly be fixed, others might
Since we're currently in a state of surveillance capitalism and