Introspecting Public Keys in ATProto#
ATProto is a decentralised network for building social applications. Users in ATProto store their data in cryptographically signed repositories, which can be freely accessed by anyone who's interested.
Every user can be uniquely identified by their Decentralized Identifier (DID), which can be used to resolve their DID document through tools like the PLC Directory. This document specifies a public key that can be used to verify the authenticity of the data in a user's repository.
I recently built the ability to introspect these public keys for ATProto Browser. This taught me a lot about how cryptography works in this network, and the ecosystem of tools that support it. As an advocate of learning by teaching (the Feynman Technique), I'll use this post to explain everything I learnt to solidify my understanding of these concepts.
Cryptography in ATProto#
The official spec for cryptography in ATProto is fairly dense, so here's a brief summary:
- ATProto uses elliptic curve cryptography.
- It supports two curves: k256 and p256. k256 is the default.
- Future revisions to the protocol may allow for other curves.
- A new keypair is generated for every new user.
- The public key is stored on the user's DID document, and is available for everyone to inspect.
Here's what my DID document looks like (some parts omitted for brevity).
{
  "@context": [...],
  "alsoKnownAs": ["at://haroldadmin.com"],
  "id": "did:plc:r7bnnxqejdsgapuagfc6dlz6",
  "service": [...],
  "verificationMethod": [
    {
      "controller": "did:plc:r7bnnxqejdsgapuagfc6dlz6",
      "id": "did:plc:r7bnnxqejdsgapuagfc6dlz6#atproto",
      "publicKeyMultibase": "zQ3shQMayW5QFrJr6Bggei8D8T4mNtKLvkcMd9tbejTTjeNYu",
      "type": "Multikey"
    }
  ]
}
Note the verificationMethod array that contains my public key (publicKeyMultibase), and that the key type is Multikey.
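Picking the key out of a resolved DID document is a small lookup. Here's a minimal sketch of it, assuming the document has already been fetched and parsed. The interfaces and the findAtprotoKey helper are names I made up for illustration, and they only model the fields shown above.

```typescript
// Hypothetical, minimal shape of a DID document: only the fields used here.
interface VerificationMethod {
  id: string;
  type: string;
  controller: string;
  publicKeyMultibase?: string;
}

interface DidDocument {
  id: string;
  verificationMethod?: VerificationMethod[];
}

// Returns the multibase-encoded public key, or undefined if none is found.
function findAtprotoKey(didDocument: DidDocument): string | undefined {
  const methods = didDocument.verificationMethod || [];
  for (let i = 0; i < methods.length; i++) {
    const method = methods[i];
    // Assumption: the signing key is the Multikey entry whose id ends in "#atproto".
    if (method.type === "Multikey" && method.id.slice(-8) === "#atproto") {
      return method.publicKeyMultibase;
    }
  }
  return undefined;
}

const sampleDocument: DidDocument = {
  id: "did:plc:r7bnnxqejdsgapuagfc6dlz6",
  verificationMethod: [
    {
      controller: "did:plc:r7bnnxqejdsgapuagfc6dlz6",
      id: "did:plc:r7bnnxqejdsgapuagfc6dlz6#atproto",
      publicKeyMultibase: "zQ3shQMayW5QFrJr6Bggei8D8T4mNtKLvkcMd9tbejTTjeNYu",
      type: "Multikey",
    },
  ],
};

const signingKey = findAtprotoKey(sampleDocument);
console.log(signingKey); // zQ3shQMayW5QFrJr6Bggei8D8T4mNtKLvkcMd9tbejTTjeNYu
```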
But what does any of this actually mean?
What's multibase?
What's a multikey?
What's a key?
Cryptographic Keys#
This is probably common knowledge to most people, but it was new to me: keys in cryptography are sequences of bytes.
Sometimes the bytes may represent a very large number, and sometimes a coordinate in an X-Y plane. Regardless of what the key represents, it's always a sequence of bytes.
Now, working with bytes is easy for computers, but hard for humans. To make them human-friendly, we often convert them to text using an encoding.
You may have come across several encoding formats in your day to day life:
- Colours in CSS are often encoded as hexadecimal codes: #d81e5b
- Inline images in HTML use the Base64 encoding: data:image/png;base64,...
- The HTML text of this blog is encoded using UTF-8: <meta charset='utf-8'>
Our choice of encoding is important -- to convert text back to bytes, we need to know which encoding was used to encode it. Similarly, we need to know which curve was used to generate the key in order to make sense of it.
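To make this concrete, here's a small sketch that reads the CSS colour from earlier as hexadecimal, and then misreads the same text as plain character codes. The toHex and fromHex helpers are my own, written only for illustration.

```typescript
// Encode bytes as lowercase hexadecimal text, two characters per byte.
function toHex(bytes: number[]): string {
  let text = "";
  for (let i = 0; i < bytes.length; i++) {
    const pair = bytes[i].toString(16);
    text += pair.length === 1 ? "0" + pair : pair;
  }
  return text;
}

// Decode hexadecimal text back to bytes, two characters at a time.
function fromHex(text: string): number[] {
  const bytes: number[] = [];
  for (let i = 0; i < text.length; i += 2) {
    bytes.push(parseInt(text.slice(i, i + 2), 16));
  }
  return bytes;
}

const colourText = "d81e5b";

// Read with the right encoding, the text yields three bytes: 216, 30, 91.
const colourBytes = fromHex(colourText);
console.log(colourBytes, toHex(colourBytes));

// Read as plain characters instead, the same text yields six unrelated bytes:
// 100, 56, 49, 101, 53, 98.
const misreadBytes: number[] = [];
for (let i = 0; i < colourText.length; i++) {
  misreadBytes.push(colourText.charCodeAt(i));
}
console.log(misreadBytes);
```

The text is identical in both cases. Only knowing the encoding tells us which bytes were meant.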
Wouldn't it be nice if our keys were self-describing? What if we could infer the key's encoding and cryptographic curve from the key itself?
This is where Multibase and Multicodec come in.
Multibase#
Multibase is a protocol to create self-describing base encodings, i.e., the encoded text can be used to infer the encoding used for it.
Since ATProto keys are multibase encoded, we can look at the prefix of my public key to infer its encoding.
The first character of my public key is z. From the multibase spec table, we can infer that z maps to base58btc.
This means my key uses Base58 encoding!
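This lookup is easy to sketch in code. The table below holds only a handful of prefixes from the multibase spec rather than the full list, and inferMultibase is a hypothetical helper name.

```typescript
// A small subset of the multibase prefix table (the real table is longer).
const MULTIBASE_PREFIXES: { [prefix: string]: string } = {
  f: "base16",
  b: "base32",
  z: "base58btc",
  m: "base64",
  u: "base64url",
};

// Infer the base encoding from the first character of a multibase string.
function inferMultibase(text: string): string {
  const encoding = MULTIBASE_PREFIXES[text.charAt(0)];
  if (encoding === undefined) {
    throw new Error("Unknown multibase prefix: " + text.charAt(0));
  }
  return encoding;
}

const detectedEncoding = inferMultibase(
  "zQ3shQMayW5QFrJr6Bggei8D8T4mNtKLvkcMd9tbejTTjeNYu"
);
console.log(detectedEncoding); // base58btc
```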
Next up, how do we infer which cryptographic curve was used to generate it?
Multicodec#
Multicodec is a protocol to create self-describing codecs for binary data.
Similar to how a multibase prefix can make a string self-describing with respect to its encoding, a multicodec prefix for a sequence of bytes can make it self-describing with respect to its codec.
If we decode my key using the base58btc decoder, we get back the following bytes:
import { bases } from "multiformats/basics";
const key = "zQ3shQMayW5QFrJr6Bggei8D8T4mNtKLvkcMd9tbejTTjeNYu";
const bytes = bases.base58btc.decode(key);
console.log(bytes); // Uint8Array(35) [231, 1, 3, 218, 159, ...]
Note that .decode() removes the base prefix from the returned byte array, so bytes only represents the remaining Q3sh... part of my key.
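If you're curious what the decoder does under the hood, here's a dependency-free sketch of base58btc decoding. It's my own illustration of the general algorithm, not the code used by the multiformats library, and decodeBase58btc is a hypothetical name.

```typescript
// The Bitcoin flavour of the Base58 alphabet (no 0, O, I, or l).
const BASE58_ALPHABET =
  "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz";

// Decode a multibase string that carries the "z" (base58btc) prefix.
function decodeBase58btc(multibase: string): Uint8Array {
  if (multibase.charAt(0) !== "z") {
    throw new Error("Expected a base58btc multibase string");
  }
  const text = multibase.slice(1); // drop the multibase prefix

  // Treat the text as one big base-58 number, kept as little-endian bytes.
  const bytes: number[] = [];
  for (let i = 0; i < text.length; i++) {
    let carry = BASE58_ALPHABET.indexOf(text.charAt(i));
    if (carry < 0) throw new Error("Invalid base58 character: " + text.charAt(i));
    // Multiply the number so far by 58 and add the new digit.
    for (let j = 0; j < bytes.length; j++) {
      carry += bytes[j] * 58;
      bytes[j] = carry & 0xff;
      carry >>= 8;
    }
    while (carry > 0) {
      bytes.push(carry & 0xff);
      carry >>= 8;
    }
  }

  // Each leading "1" in the text stands for one leading zero byte.
  for (let i = 0; i < text.length && text.charAt(i) === "1"; i++) {
    bytes.push(0);
  }
  return new Uint8Array(bytes.reverse());
}

const multikeyBytes = decodeBase58btc(
  "zQ3shQMayW5QFrJr6Bggei8D8T4mNtKLvkcMd9tbejTTjeNYu"
);
// Should agree with the library output above: 35 bytes, starting with 231, 1.
console.log(multikeyBytes.length, multikeyBytes[0], multikeyBytes[1]);
```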
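Upon inspection, we can see that the decoded bytes start with 231 and 1. Multicodec prefixes are encoded as unsigned varints, and these two bytes together decode to the number 231 (0xe7). Looking at the Multicodec table, we can see that 0xe7 corresponds to the secp256k1-pub codec.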
This means the key was generated with the k256 curve!
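Here's a rough sketch of how that prefix can be read in code. Multicodec codes are stored as unsigned varints, where each byte contributes its low 7 bits and the high bit signals that another byte follows. The readVarint helper and the two-entry KEY_CODECS table are my own simplifications. The real multicodec table is much larger.

```typescript
// The two key codecs relevant to ATProto, keyed by their multicodec code.
const KEY_CODECS: { [code: number]: string } = {
  0xe7: "secp256k1-pub", // k256
  0x1200: "p256-pub", // p256
};

// Read an unsigned varint from the start of a byte sequence.
// Returns the decoded value and the number of bytes it occupied.
function readVarint(bytes: number[]): { value: number; length: number } {
  let value = 0;
  for (let i = 0; i < bytes.length; i++) {
    // The low 7 bits carry data, least significant group first.
    value += (bytes[i] & 0x7f) * Math.pow(2, 7 * i);
    // A clear high bit marks the final byte of the varint.
    if ((bytes[i] & 0x80) === 0) {
      return { value: value, length: i + 1 };
    }
  }
  throw new Error("Unterminated varint");
}

// The first three bytes of my decoded key.
const decodedPrefix = readVarint([231, 1, 3]);
console.log(decodedPrefix.value, decodedPrefix.length); // 231 2
console.log(KEY_CODECS[decodedPrefix.value]); // secp256k1-pub
```

The prefix takes up two bytes, so the key itself starts at the third byte.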
Bringing it all together#
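Multikeys follow this structure: a multibase prefix character, followed by the encoded form of a multicodec prefix and the raw key bytes.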
Through our analysis of my public key, we learnt that:
- Its multibase code is z, which corresponds to the base58btc encoding.
- Its multicodec code is 231, which corresponds to the secp256k1-pub codec.
- The remaining 33 bytes represent the actual key.
Most keys in ATProto will follow a similar format, so this introspection is not needed in most cases. Why bother using multikeys at all, then? It's because they make it trivial to support other cryptographic curves and encoding formats in future revisions of the protocol.
If you want to see this introspection in action, check out your ATProto repo on ATProto Browser!