Encryption
As we know, NextGraph is local-first, and synchronizes its data with other replicas (other devices running the app) when it has internet connectivity.
There exist already plenty of protocols in OSS that could be used for syncing, some of them being End-to-end encrypted (E2EE).
ActivityPub, XMPP, OpenMLS, Signal, Matrix, Nostr, ATproto/Bluesky, etc… can all be used as transport protocols for the syncing mechanism.
Why did NextGraph have to reinvent the wheel and create a special protocol, once again ?
And specially knowing that doing so, seems counter productive.
The answer is: because we didn’t have the choice but to create a new protocol, because none of the existing ones had the characteristics we need for NextGraph to deliver on its promise. More details coming now.
The goal is to have a decentralized protocol that does not depend on any central authority or single point of failure.
That’s easy to say… and a bit more difficult to put in practice, specially when E2EE enters into play.
Here is the breakdown of the problem.
No more DNS, TLS nor HTTPS
DNS and TLS certificates are not fully decentralized and they both rely on some single points of failure (and domains are expensive, and hard to setup and manage for the end-user). TLS was discarded by the creator of Signal Protocol by example, because his main goal was to offer a secure and totally decentralized chat protocol. We also have to forgo DNS and TLS in our architecture, for similar reasons. That’s not an easy task. Matrix, by example, that is just another iteration on the above mentioned Signal Protocol (reimplemented in permissive licensing, and with a slightly different protocol for group encryption), felt the need to wrap all the E2EE protocol within some classical REST apis with TLS and a domain name, so basically they voided the advantage that Signal had created: no need for DNS nor TLS certificates. The result is Synapse protocol and Home Server: a mix of E2EE and plain old TLS APIs. That’s not what we did. In our protocol, every peer is known by its IP and a PeerID that is a public key. There is no DNS, there is no TLS certificate. We use the Noise protocol for encryption of communication between peers, which is, for now, channeled into a WebSocket, but could be transported over other protocols, including UDP, QUIC or WebRTC, by example.
P2P isn’t viable. we need 2-Tier network with Brokers
Pure Peer to Peer (P2P) is not practical in real life. Even though we claim NextGraph is peer-to-peer (and it is, somehow), our network topology is not exactly peer-to-peer. What we do is called “2-tier topology”. It means that the peers/replicas do not communicate directly with each other. Instead, they send and receive their data from some tiers that we call “Brokers”. The broker is a server that is always up and running, and that replicas can always contact in order to sync their data. To the contrary, a truly P2P topology would imply that the replicas are connecting directly to each other. This is what HyperHyperSpace is doing by example, with WebRTC that establishes connections directly between web-browsers. Several other protocols are doing direct P2P communication. While it looks good on the paper, and is attractive because it seems at first glance as a desirable feature, it turns out that true P2P is not practical because it forces the peers/replicas that want to sync and exchange data, to be both online at the very same moment. This condition is not guaranteed at all and will impede synchronization if not met. What are the chances that any participant of a collaborative group are connected to the internet at the very same time? very low in real life cases. If I want to sync between several of my own devices, does that mean that at least 2 of my devices have to be online at the same time? this is not gonna work. For this reason, we opted for a “2-tier” topology. Peers that are intermittently connected to the internet, can still synchronize because they will talk only to a common “broker” that is always up and running and that is there specially for that. In fact each broker is connected to the other brokers in the overlay, so it doesn’t matter to which broker the client is connected.
End-to-End Encryption of everything
Now comes another problem. If the concept of a “server” is back in the picture, what about the guarantees on privacy and single point of failure (SPOF)? Well, we have thought about that. Concerns for SPOF is easy to clear if you allow many brokers to be available at the same time, if the broker can be self-hosted by anyone, and if the user can switch to failover brokers transparently. This is what we did in our protocol design. And about concerns for privacy, the solution is very simple (at least, on the paper): End-to-End-Encryption, sometimes called zero-knowledge. The broker only receives and forwards messages that are E2E encrypted. It cannot read what is in the messages. The only thing it can see is the IP of the devices, and their PeerID (a public key). Only the replicas (the clients) can decrypt the messages. So basically, we reinvented a group E2EE protocol? yes! and why not reuse Signal protocol, Matrix protocol, OMEMO or OpenMLS that already have a very well established Double-Rachet E2EE mechanism? Well, this also could not have been done, because of the following issues.
Group encryption is not enough
All 3 existing protocols I just mentioned, are derived from the first one: double ratchet by Moxie Marlinspike et al. This protocol is very good at one thing: enforce privacy for the participants within the group. And prevent any other agent that is outside of the group, from seeing any plaintext (the messages in clear, unencrypted form). The guarantees are simple: Or you are inside the group and you can see everything within this group. Or you are outside of the group and you cannot see anything. Very nice! Very useful. Everybody uses it on a daily basis, in WhatsApp, by example. In addition, it adds a guarantee called Perfect Forward Secrecy (PFS) which has a negative side-effect for newcomers that will join the group later on (that are invited to join after a while). They will not be able to see the historical content of the group. That’s good for paranoid people who are afraid a compromised key would let unauthorized people see all the past conversation. And it also is based on the concept that this compromised key would be detected, and the intruder would be kicked out fast enough so he doesn’t see any newer messages neither (that’s Post Compromise Security), which is another problem difficult to solve. Anyway, it turns out that PFS is an anti-feature in our case, because we obviously want a joining collaborator to have access to all the historical data of a DAG of modification of a document. If they wouldn’t have that, they wouldn’t be able to reconstruct the document’s content at all. This is a problem that Matrix is facing. So basically, we do not want PFS. And this problem is being addressed in a new draft specification called MIMI, a working group of the IETF. But even if this is solved, I am afraid it will not be enough for our use cases.
Like with the double ratchet, we want to renew epoch from time to time. Specially when we revoke the access of a specific user. In this case, the encryption keys need to be refreshed. We can also do so in a periodic manner, for extra security. Another reason why existing group E2EE protocols are not adapted for us, is that the “all or nothing“ access pattern is not what we need neither.
A repo has editors, viewers and signers
In NextGraph, each Document has its own E2EE group, that we call a Repo. Now, a Repo has editors, obviously, and those should be inside the E2EE group. Sure thing. Then, we also have users that are only “readers”. they cannot edit the document. Should they be part of the E2EE group? how are the permissions going to be enforced? who is going to check if this specific user can edit or not, in a local-first setting? NextGraph uses the concept of capabilities (OCAP) for permissions. A user can read the content of a document if they have the capability to do so. Furthermore, the capability system that we have designed, is not a traditional OCAP mechanism where a “controller” eventually decides on what to do with an incoming action and associated capability. Instead, in NextGraph, there is no need for an almighty controller. We have designed a capability mechanism based on cryptography. If the user possesses the decryption key, then they are entitled to read the data. There is no need for a “controller” to verify the signature of the capability/delegation they hold. Instead, the user just gets some encrypted data, and if they have the correct decryption key, then it means they will be able to decrypt and read the data. if they don’t, well, they will see garbage instead. About the “write” permission, it is the same. The editor needs to possess the private key of the Pub/Sub topic that is used to propagate the edits (more on the pub/sub later). So basically, if a user wants to modify the data, they can try as much as they want locally, but if they don’t have the private key of the pub/sub topic, it will not work because they will not be able to propagate their updates in the overlay (the brokers will refuse it). Again, there is no need for a “controller” here that will check a capability, or better said: the capability is included in the message. The “write” permission is just enforced with cryptography.
So, coming back to the problem of the E2EE group, we have several types of users with different types of access, that interact within the Repo. in addition to the editors and the readers, we also have signers. Those signers have only one task: Verify and Sign the commits, on request. They are not necessarily editors. But they do need read access in order to read what they are going to sign. So we have 3 types of access (and a 4th type for the owners of the documents). Anyway. it starts to seem obvious that a classical E2EE group like OpenMLS or Matrix will not be enough, because we do not want the readers, by example, to see what the editors are doing. We don’t even want them to see the list of editors. Are we going to create 4 different OpenMLS group? one for each access type? No. That would not work in practice, and not even on the paper.
So here we are, we designed and implemented a new E2EE group protocol, that takes into account the specific needs of Document editing, signing and sharing, together with cryptographic capabilities. And this… did not exist, as far as we know. Add to this the fact that we also do content-addressing and deduplication with convergent encryption on all the content and pub/sub messages, and you will soon understand why a novel protocol was needed.
All this gives us a general overview of what our sync protocol can do, how it does it, and why we had to design a new protocol.
More details about the protocol itself can be found in the Design section below.
But for now, let’s move on to explore our Wallet and what it contains.