Add session resumption through seeds #17

74 changes: 74 additions & 0 deletions client-protocol.md
@@ -103,3 +103,77 @@ will be released, and the WebSocket connection will be dropped.

Now the client connection is fully set up, and the application specific messages
(those with numeric phases) may be exchanged.

## Wormhole Seeds

Once two clients have connected to each other, they have a shared secret.
This can be used to establish a new Wormhole connection without a human
entering codes. If A says "I want to connect to B" and B does the same, they'll
find each other and get a secure connection. Some additional data needs to be
exchanged and stored in order to allow for a good user experience.

Support for session resumption is declared using the
`seeds-v1` ability during the `versions` phase. Additionally, a `seeds`
key must be added to the versions message that roughly looks like this:

```json
{
  "abilities": [ "seeds-v1" ],
  "app_versions": {},
  "seeds": {
    "display_names": [<string>],
    "known_seeds": [<string>]
  }
}
```

A client may choose a list of `display_names` in order to be recognizable. Note
that client names may be arbitrary, collide with other sessions or change over
time. Any valid UTF-8 string may be used as a name, as long as it contains none
of the following characters: `'`, `"` and `,`.

It is up to the clients to keep track of such a mapping and keep it up to
date, if they want to. It is also up to the clients to name themselves.
We recommend giving at least two values: one with the user's
name and one that also disambiguates between multiple devices the user may have
(now or in the future). The list must be sorted in decreasing order of preference.
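
As a minimal sketch of the rules above, a client might validate its display names and assemble the `seeds` part of the versions message as follows. The helper names `validate_display_name` and `build_versions_message` are hypothetical and not part of the protocol.

```python
import json

# The three characters that must not appear in a display name.
FORBIDDEN_CHARS = set("'\",")


def validate_display_name(name: str) -> str:
    """Return the name unchanged, or raise if it contains a forbidden character."""
    bad = FORBIDDEN_CHARS.intersection(name)
    if bad:
        raise ValueError(f"display name contains forbidden characters: {sorted(bad)}")
    return name


def build_versions_message(display_names, known_seeds):
    """Build the versions message body (hypothetical helper).

    `display_names` is expected in decreasing order of preference.
    """
    return {
        "abilities": ["seeds-v1"],
        "app_versions": {},
        "seeds": {
            "display_names": [validate_display_name(n) for n in display_names],
            "known_seeds": list(known_seeds),
        },
    }


message = build_versions_message(["Alice", "Alice (laptop)"], [])
print(json.dumps(message, indent=2))
```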

A seed is derived from the shared session key like this:

```python
# `derive(key, purpose)` is the usual key derivation function
seed = hex(derive(session_key, "wormhole:seed"))
```

The `seed` is the main shared secret between the peers and all other data will
be derived from it:

```python
password = hex(derive(seed, "wormhole:seed:password"))
nameplate = hex(derive(seed, "wormhole:seed:nameplate"))
```

> Review comment by @piegamesde (Member Author), Mar 20, 2022:
>
> Using a high entropy nameplate instead of connecting to a mailbox directly solves a lot of the connection issues that I previously had. Notably, multiple concurrent independent connections are possible using the same seed because they will end up in a different mailbox. Also, for the same reason, the error recovery when a mailbox gets "crowded" is a lot quicker.
>
> The current rendezvous server implementation supports this without issues, but nevertheless it may be a good idea to codify this in the protocol that we now rely on this possibility. Edit: added a new commit specifying "high entropy nameplates".

To "grow" a seed (resume a connection), both sides connect to the rendezvous server
using `${nameplate}-${password}` as the code. The code is entered automatically without
user interaction. On a resumed connection, setting the `seeds-v1` ability in the
`versions` phase is no longer required.
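
The derivation chain from session key to resumption code can be sketched as below. This is an illustration under stated assumptions, not a normative definition: `derive` is assumed here to be HKDF-SHA256 (RFC 5869) without a salt and with the purpose as the info field, a 32-byte output is assumed, and the hex-encoded seed is assumed to be fed back into `derive` as its ASCII bytes. The function name `resumption_code` is hypothetical.

```python
import hashlib
import hmac


def derive(key: bytes, purpose: str, length: int = 32) -> bytes:
    """ASSUMPTION: HKDF-SHA256 with no salt and `purpose` as the info field."""
    # HKDF-Extract; RFC 5869 uses a zero-filled salt when none is given.
    prk = hmac.new(b"\x00" * 32, key, hashlib.sha256).digest()
    # HKDF-Expand.
    info = purpose.encode("utf-8")
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def resumption_code(session_key: bytes) -> str:
    """Derive seed, password and nameplate, then build the code (hypothetical helper)."""
    seed = derive(session_key, "wormhole:seed").hex()
    # ASSUMPTION: the hex seed is used as key material via its ASCII bytes.
    seed_bytes = seed.encode("ascii")
    password = derive(seed_bytes, "wormhole:seed:password").hex()
    nameplate = derive(seed_bytes, "wormhole:seed:nameplate").hex()
    return f"{nameplate}-{password}"


code = resumption_code(b"\x01" * 32)
print(code)
```

Both peers hold the same session key, so both compute the same code independently and can claim the same nameplate without any user input.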

On normal connections where both sides support the seeds ability, clients may
wish to know whether they already share a seed with the peer. For this,
they may list all their known seeds in `known_seeds`, each hashed
with a key derivation function using the raw (i.e. not hex-encoded) session ID
as purpose. By simple set intersection, each side then finds the seeds they
have in common (provided that both sides act faithfully). The process is equivalent
to a simple *private set intersection* protocol, meaning that as long as the
session ID is unique, no sensitive contact graph information will be leaked.
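
The matching step might look like the following sketch. It assumes HKDF-SHA256 as the key derivation function, hex encoding for the entries of `known_seeds`, and the seed's ASCII bytes as key material; none of these are fixed by the text above. The helpers `blind_seeds` and `common_seeds` are hypothetical names.

```python
import hashlib
import hmac


def derive(key: bytes, purpose: bytes, length: int = 32) -> bytes:
    """ASSUMPTION: HKDF-SHA256 with no salt and `purpose` as the info field."""
    prk = hmac.new(b"\x00" * 32, key, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + purpose + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def blind_seeds(seeds, session_id: bytes) -> dict:
    """Map each blinded value (hex) back to the seed it was computed from."""
    return {derive(s.encode("ascii"), session_id).hex(): s for s in seeds}


def common_seeds(my_seeds, peer_known_seeds, session_id: bytes) -> set:
    """Return the locally stored seeds whose blinded form the peer also sent."""
    mine = blind_seeds(my_seeds, session_id)
    return {mine[h] for h in peer_known_seeds if h in mine}


# Illustrative values only; real seeds are hex strings derived from session keys.
session_id = bytes.fromhex("ff000a1b")
alice_seeds = ["seed-aa", "seed-bb"]
bob_seeds = ["seed-bb", "seed-cc"]

alice_sends = list(blind_seeds(alice_seeds, session_id))  # goes into `known_seeds`
print(common_seeds(bob_seeds, alice_sends, session_id))
```

Because the session ID changes with every connection, the blinded values from one session cannot be matched against those from another.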

### Client implementation notes

- Clients should notify the user about the display names feature, or even make it
opt-in. For some people, a user name or device name is sensitive information.
- It is up to the clients whether they make pairings explicit or automatic.
- Seeds only work if both sides store the seed. A seed stored on only one side
will not function. Clients must deal with this scenario.
- An expiration time of 12 months for explicitly stored seeds is recommended.
Automatically stored seeds (e.g. for session resumption) should expire after
1-14 days, depending on the use case.
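
A client-side expiry check following the recommendation above could be sketched like this. The constants approximate "12 months" as 365 days and pick the upper end of the 1-14 day range; both values, and the name `is_expired`, are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone

# ASSUMPTION: "12 months" approximated as 365 days.
EXPLICIT_LIFETIME = timedelta(days=365)
# ASSUMPTION: upper end of the recommended 1-14 day range.
AUTOMATIC_LIFETIME = timedelta(days=14)


def is_expired(stored_at: datetime, explicit: bool, now: datetime) -> bool:
    """Return True if a stored seed has outlived its recommended lifetime."""
    lifetime = EXPLICIT_LIFETIME if explicit else AUTOMATIC_LIFETIME
    return now - stored_at > lifetime


stored = datetime(2022, 1, 1, tzinfo=timezone.utc)
now = datetime(2022, 1, 20, tzinfo=timezone.utc)
print(is_expired(stored, explicit=False, now=now))
print(is_expired(stored, explicit=True, now=now))
```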
21 changes: 18 additions & 3 deletions server-protocol.md
@@ -12,9 +12,16 @@ Mailboxes are identified by a large random string. "Nameplates", in contrast,
have short numeric identities: in a wormhole code like "4-purple-sausages",
the "4" is the nameplate.

Each client has a randomly-generated "side", a short hex string, used to
differentiate between echoes of a client's own message, and real messages
from the other client.
Each client has a "side", a short hex string coming from a CSPRNG (yes,
despite the low entropy it should come from a cryptographically secure random
generator). Its main purpose is to differentiate between echoes of a client's
own message and real messages from the other client. It might also be used to
give each session a unique identification, and to feed cryptographic challenges
in other parts of the protocol (think of it as a salt). For this purpose, we
also define a "session ID" as follows: we sort both sides lexicographically
(comparing the first, second, etc. bytes) and concatenate them, with the higher
one first and the lower one second.
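
The session ID construction can be sketched as follows. The sketch assumes that both sides are hex strings that decode cleanly to bytes, and that the "raw" session ID means the concatenation of the decoded bytes. The function name `session_id` is hypothetical.

```python
def session_id(side_a: str, side_b: str) -> bytes:
    """Compute the raw session ID from both sides (hypothetical helper).

    ASSUMPTION: the sides are compared and concatenated as decoded bytes.
    """
    a, b = bytes.fromhex(side_a), bytes.fromhex(side_b)
    # Python compares bytes objects lexicographically, byte by byte.
    lower, higher = sorted([a, b])
    # The lower side is appended after the higher one.
    return higher + lower


print(session_id("0a1b", "ff00").hex())
```

Since the result does not depend on argument order, both clients compute the same value regardless of which side they consider their own.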


## Application IDs

@@ -214,6 +221,14 @@ may record additional attributes in the nameplate records, specifically a
wordlist identifier and a code length (again to help with code-completion on
the receiver).

### High entropy nameplates

Nameplates in the range above 1000000 (one million) and alphanumeric nameplates
are called "high entropy". They should not be used for the `allocate` command,
and should instead always be claimed directly by clients. It is up to the clients
to use sufficient entropy in order to make accidental collisions rare. Furthermore,
high entropy nameplates should not exempt from the `list` command.
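
One way a server or client might classify a nameplate according to the definition above is sketched below. The function name `is_high_entropy` is hypothetical, and treating "alphanumeric" as "ASCII letters and digits, not purely numeric" is an assumption.

```python
HIGH_ENTROPY_THRESHOLD = 1_000_000  # numeric nameplates above this are "high entropy"


def is_high_entropy(nameplate: str) -> bool:
    """Classify a nameplate as high entropy (hypothetical helper)."""
    if not nameplate.isascii():
        return False
    if nameplate.isdigit():
        # Purely numeric: only values above one million qualify.
        return int(nameplate) > HIGH_ENTROPY_THRESHOLD
    # ASSUMPTION: any other ASCII alphanumeric string counts as high entropy.
    return nameplate.isalnum()


for candidate in ["4", "1000000", "1000001", "a3f9c2"]:
    print(candidate, is_high_entropy(candidate))
```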

## Mailboxes

The server provides a single "Mailbox" to each pair of connecting Wormhole