-
Notifications
You must be signed in to change notification settings - Fork 385
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MSC3981: /relations
recursion
#3981
Changes from 16 commits
098b343
abddab8
4f61e43
381a355
bbcc81a
f7e5ca3
ffb316d
077f6b1
c638630
dad341e
6d88fe8
123eef1
65cab28
6baafd8
b6119c5
cfd4223
94561ad
f404254
03e8d68
0e50bdb
a4d8e73
a027354
6ace9f4
c160d36
bdae577
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,129 @@ | ||
# MSC3981: `/relations` recursion | ||
|
||
The [`/relations`] API allows clients to retrieve events which directly relate | ||
to a given event. | ||
|
||
This API has been used as basis of many other features and MSCs since then, | ||
including threads. | ||
|
||
Threads was one of the first usages of this API that allowed nested relations - | ||
an event may have an [`m.reaction`] or [`m.replace`] relation to another event, | ||
which in turn may have an `m.thread` relation to the thread root. | ||
|
||
This forms a tree of relations, which can currently only be traversed | ||
efficiently in hierarchical, but not in chronological order. Yet, for some | ||
functionality – e.g., to determine which event a read receipt should | ||
reference as per [MSC3771] – chronological order is necessary. | ||
anoadragon453 marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
justjanne marked this conversation as resolved.
Show resolved
Hide resolved
|
||
Previously, clients would be unable to obtain a consistent ordering of | ||
events in threads or related to threads. | ||
Workarounds such as sending a separate `/relations` request per individual | ||
message in each thread are only able to approximate the actual ordering, | ||
as they rely on timestamps. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Expanding from my previous comment asking for context, This paragraph touches a bit on the context of why which is good but not technically why. Perhaps an example like below but where it breaks down with the current implementation. Or point out in that example how it breaks down without this MSC. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hopefully I've made this a lot more straightforward now, ie. for me it's really just a question of needing a way to get all the events you need to render a thread, and that includes relations to the thread events. |
||
|
||
## Proposal | ||
|
||
It is proposed to add the `recurse` parameter to the `/relations` API, defined | ||
as follows: | ||
|
||
> Whether to include events which relate only indirectly to the given event. | ||
anoadragon453 marked this conversation as resolved.
Show resolved
Hide resolved
|
||
> | ||
> If set to false, only events which have direct a relation with the given | ||
> event will be included. | ||
> | ||
> If set to true, all events which relate to the given event, or relate to | ||
> events that relate to the given event, will be included. | ||
> | ||
> It is recommended that at least 3 levels of relationships are traversed. | ||
uhoreg marked this conversation as resolved.
Show resolved
Hide resolved
|
||
> Implementations should be careful to not infinitely recurse. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This paragraph weakens the reliability of the behaviour, IMO, because the most pessimistic outcome of using this API would be that you only have 3 levels of recursively related events, while the behaviour of infinite recursion is not defined. To address the levels problem, what about making this parameter (or another parameter) accept the amount of levels a client would see the server want to try to resolve? (The server would simply go up to that many levels of search, specifying To address the issue of infinite recursion, while I'm currently not aware of any way an event can be edited to re-define its relation to another event, it doesn't mean that such a feature could exist in the future (to, for example, edit the message you've replied to, when you made a mistake). As such, what about specifying that, if a server detects a cycle, it will simply cut it off, and return as-is? This way, the guarantee that it returns all recursively-related events at least once will still be satisfied. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
As there are currently no relations defined that allow you to get more than 3 levels of relations, that's an entirely academic exercise for the time being. I'd suggest that we change this if and when we've defined types of relations that can nest deeper than this. At that point we'd also have a better understanding of what the edge cases might look like. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. My opinion is is that this adds tech debt, if the intent is to recurse a "reasonable" amount of times, and "reasonable" is considered to be 3 levels - due to the current set of relations there exists - at the time of writing, then that wording should be used. An exact number shouldn't be used, because that bit of behaviour will surely become outdated when more relation types emerge, and would require an MSC to change, as it is defined behaviour. The fact that this number is 3, given reasonable optimisation concerns and trade-offs, should be an implementation detail, and not a recommendation. What I'm more worried about is that this sentence weakens the guarantee that all recursively related events are returned, and that a client would have to call this endpoint again with some select events "just to be sure" that they got "all of them", if the client wants to be pedantic. (Which it must, if it wants to display consistent and correct behaviour, when the behaviour of the server is to not return "all" recursively referred events, but only "some", capping off at an arbitrary level) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @clokep You originally suggested the number of 3, what's your opinion on this? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah I am inclined to agree that it's easy to imagine needing >3 and having to change this. Maybe a number like 10 could be a good minimum - then we'd "know" that usage patterns similar to what exists now would definitely be covered, and the only cases that would not be would be ones that allow arbitrary-length reference chains. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
This text was the recommended text (I think from @turt2live) in MSC3771 and MSC3773 and exists in a similar form in the spec today:
tl;dr I'd leave it as is, but maybe the section on why a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Should this not be "at most 3 levels"? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think the spec is trying to ensure servers do the bare minimum level of recursion here: if they want to do more, they can. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Yes, this is the goal. There are warnings not to infinitely recurse, but we wanted implementations to have a minimum required recursion amount. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks - in that case I'd say this is fine as-is then. Infinite recursion is warned against elsewhere and we don't need to tell people exactly how to write their servers. That said, I have to admit I'm struggling to understand the meaning of @ShadowJonathan 's original comment so please correct me if I've misunderstood. |
||
> | ||
> One of: `[true false]`. | ||
|
||
In order to be backwards compatible the `recurse` parameter must be | ||
optional (defaulting to `false`). | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I wonder if existing client implementations are good enough to face the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe, but personally I'd rather just flip the flag in the client once I know it worked that go scouting for what clients implement it and try to make sure they all behave sensibly when the server breaks (old) spec. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. My main concern here is that we may extend the API surface for no actual use, with one option ( |
||
|
||
In this situation, topological ordering is intended to refer to the same | ||
ordering that would be applied to the events when requested via the `/messages` | ||
api given the same `dir` parameter. | ||
|
||
Regardless of the value of the `recurse` parameter, events will always be | ||
returned in topological ordering. Pagination and limiting shall also apply to | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The problem statement at the beginning of this proposal says "chronological order is necessary" to correctly evaluate read receipts. How will this proposal help address this when the API always returns events in topological order? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It doesn't feel like "chronological order" is needed at all and I think it just might be a mistake in wording since What read receipts probably care about is Good reading: https://github.com/matrix-org/synapse/blob/develop/docs/development/room-dag-concepts.md#depth-and-stream-ordering In terms of a solution to always choose the correct event for the read receipt, maybe we only construct timelines from AFAICT, this MSC improves read receipts on threads because it makes it easier to get the list of events for a given thread in There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks a ton Eric! 🙏 There's another recent proposal, MSC4033, that attempts to solve the ordering problem more holistically which might partially obsolete this one (as far as the ordering problematic is concerned). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 👍 Thanks for cross-linking! MSC4033 is definitely tackling the right problem and exposes the same confusion I had while trying to draft my last reply. The spec is not clear what order read receipts use, I assumed it was In any case, the problem of There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Likewise I've removed order-related stuff from the equation here, as I think it was a distraction, and in any case this flag should only change whether this API includes related events or not. It shouldn't affect the order. |
||
topological ordering. | ||
|
||
Filters specified via `event_type` or `rel_type` will be applied to leaves | ||
and intermediate events on the graph formed by the related events. | ||
Events that would match the filter, but whose only relation to the original | ||
given event is through a non-matching intermediate event, will not be included. | ||
|
||
## Potential issues | ||
|
||
Naive implementations might be tempted to provide support for this parameter | ||
through a thin shim which is functionally identical to the client doing | ||
separate recursive `/relations` requests itself. This is ill-advised. | ||
|
||
Such an implementation would, given a specifically crafted set of events, | ||
allow a client to cause unreasonable load. | ||
|
||
## Alternatives | ||
justjanne marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
1. Clients could fetch all relations recursively client-side, which would | ||
increase network traffic and server load significantly. | ||
2. A new, specialised endpoint could be created for threads, specifically | ||
designed to present separate timelines that, in all other ways, would | ||
behave identically to `/messages`. | ||
3. Twitter-style threads (see [MSC2836]). | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. When I proposed aggregations (in the mess that is MSC1849) the intention was basically to define a graph of relations linking events together:
I assumed that when users wanted to do complicated graph queries (e.g. "give me all the events in this thread, including their aggregations") or "give me the siblings of this twitter thread") we'd end up with a generic API for navigating the relations graph - and that's what kegan & I came up with in [MSC2836]. Personally, my hunch is still that a single generic API for navigating the relation graph would have been a better solution than having lots of incremental MSCs which gradually nudge the existing APIs in the right direction. However, we've ended up down the incrementalist path anyway now, and this MSC seems a good enough way to solve the immediate problem of "give me all the events which relate to this thread". However, I have a horrible feeling that we're promptly going to hit another limitation for another relations use case (e.g. "I want to see the last 20 beacon locations for the 5 most recent location-shares in this thread. Including if the location-share event subsequently got edited" - or a similar one for VoIP signalling using aggregations, or even MSC4016-style file transfer updates via aggregations)... and find ourselves wanting a more generic API, again. I'm also dubious about "the server may just ignore your recursion request, impose an arbitrary depth, and the client will have to figure it out". I'd have thought it would be better for the client to state its requirements (i.e. "i need depth=3 for my use case") and for the server to fulfil those... but to cap resource usage by the server then batching up the responses in whatever size it's most comfortable with (or by capping the pagination requests that the client makes). So even if the client asks for some horribly inefficient query, the server can decline to calculate more than 10 results at a time or whatever. Otherwise it's passing a lot of complexity onto the client to work around the server's limitations, and the design of Matrix has always been for the server to do the heavy-lifting rather than the client, in order to make clients as easy to write as possible. So, TL;DR: I don't love this, but it's an acceptable solution for where we're at right now and I'm not going to block it, lest perfect be the enemy of good. In future I really hope we converge on a robust generic way in a new MSC (or MSC2836 gets dusted off) in order to walk the relations DAG rather than sprouting incremental tweaks. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I feel fairly similarly about this really in that much more advanced uses of relations will likely require a more sophisticated API that can properly control recursion depth and what rel_types get recursively returned, etc. I also think the way we've chosen to implement features, ie. with threads as a live feature in Matrix now that, apart from not working very well itself is also causing other parts of the protocol to be flakey: we need to be fairly tactical here. Obviously we can introduce a more full-featured API separately but equally it's a shame that servers will have both APIs to maintain. |
||
4. Alternatively a `depth` parameter could have been specified, as in [MSC2836]. | ||
We believe that a customizable depth would add unnecessary constraints to | ||
server implementers, as different server implementations may have different | ||
performance considerations and may choose different limits. Additionally, | ||
the maximum currently achievable depth is still low enough to avoid this | ||
becoming an issue. | ||
|
||
## Security considerations | ||
|
||
None. | ||
dbkr marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
## Examples | ||
|
||
Given the following graph: | ||
|
||
```mermaid | ||
flowchart RL | ||
subgraph Thread | ||
G | ||
E-->D | ||
B-->A | ||
end | ||
B-.->|m.thread|A | ||
G-.->|m.thread|A | ||
E-.->|m.annotation|B | ||
D-.->|m.edit|A | ||
G-->F-->E | ||
D-->C-->B | ||
``` | ||
|
||
`/messages` with `dir=f` would | ||
return `[A, B, C, D, E, F, G]`. | ||
|
||
`/relations` on event `A` with `rel_type=m.thread` and `dir=f` would | ||
return `[B, G]`. | ||
|
||
`/relations` on event `A` with `recurse=true` and `dir=f` would | ||
return `[B, D, E, G]`. | ||
|
||
`/relations` on event `A` with `recurse=true`, `dir=b` and `limit=2` would | ||
return `[G, E]`. | ||
|
||
`/relations` on event `A` with `rel_type=m.annotation`, | ||
`event_type=m.reaction` and `recurse=true` would return `[G, E]`. | ||
|
||
## Unstable prefix | ||
|
||
Before standardization, `org.matrix.msc3981.recurse` should be used in place | ||
of `recurse`. | ||
|
||
[MSC2836]: https://github.com/matrix-org/matrix-spec-proposals/pull/2836 | ||
[MSC3771]: https://github.com/matrix-org/matrix-spec-proposals/pull/3771 | ||
[`/relations`]: https://spec.matrix.org/v1.6/client-server-api/#get_matrixclientv1roomsroomidrelationseventid | ||
[`m.reaction`]: https://github.com/matrix-org/matrix-spec-proposals/pull/2677 | ||
[`m.replace`]: https://spec.matrix.org/v1.6/client-server-api/#event-replacements |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FTR I think this is only an issue with thread & reference relations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is saying only thread & reference relations can have second-order relations? In which case I think probably (although I'd be surprised if references ever did) but I'm not sure it makes a huge difference here.