cleanup and update Mining Protocol specs #98

Merged: 9 commits, Dec 10, 2024

Contributor

@plebhash plebhash commented Sep 5, 2024

close #96

cleanup and update Mining Protocol specs

The current state of the Mining Protocol Specs leaves some Key Concepts for the reader to infer.

Moreover, there are visual references with out-of-date information.

This PR aims to bring more clarity to the Mining Protocol Specs by:

  • explicitly defining some Key Concepts
  • providing some up-to-date visual aids

New Key Concept definitions

  • Job (Standard, Extended, Future, Custom)
  • Extended Extranonce

Considerations

  • This PR does not aim to re-write the Mining Protocol Specs. All semantics should remain fully aligned with the original ideas coming from the Spec authors.

  • This PR does not aim to introduce big changes under 5.4 Mining Protocol Messages. Even though I believe this section deserves a cleanup in terms of formatting standards, I want to limit the scope of this PR to improving the conceptual clarity of the Mining Protocol Specs.

  • Some Key Concepts are referred to before they are formally introduced. This is a bit of a conceptual "chicken-and-egg" problem and I couldn't really find an optimal arrangement for the doc without making some compromises. I believe the original Spec authors faced similar challenges, and at the end of the day, we should expect the reader to go through the doc multiple times before getting a full grasp of the ideas being presented.

  • Reviewing diff lines in Markdown syntax can be hard, especially when it comes to seeing the visual aids side by side with the text. I would highly recommend that reviewers refer to this rendering on my fork.

Ideas for follow-up PRs

  • implement formatting standards across the entire doc (hopefully in congruence with all Markdown files under this repository).
  • add some flowcharts illustrating Message flows across different Roles under the Mining Protocol.

@plebhash plebhash force-pushed the update-mining-protocol branch 6 times, most recently from 11ae8cd to 6bb2c1c Compare September 5, 2024 22:21
@plebhash plebhash changed the title cleanup Mining Protocol specs cleanup and update Mining Protocol specs Sep 5, 2024
@plebhash plebhash force-pushed the update-mining-protocol branch 3 times, most recently from e481ac2 to 3e986cd Compare September 5, 2024 23:50
@Sjors
Contributor

Sjors commented Sep 6, 2024

I find the translator diagram a bit too visually cluttered.

Since the information for the second and third device is duplicated, you could just leave it out:

[Image: screenshot, 2024-09-06 09:10:42]

Plus a small visual hint that the pattern is repeated.

@Sjors
Contributor

Sjors commented Sep 6, 2024

It would also be useful to add the .dot files you used to generate these graphics, so they can be more easily modified in the future.

Collaborator

@GitGab19 GitGab19 left a comment

Great work @plebhash, I really like it! 👏
I left some comments and questions that could be topics for discussion.


The size of search space for an extended channel is `2^(NONCE_BITS+VERSION_ROLLING_BITS+extranonce_size*8)` per `nTime` value.
The size of the search space for one Standard Job, given a fixed `nTime` field, is `2^(NONCE_BITS + BIP320_VERSION_ROLLING_BITS) = ~280Th`, where `NONCE_BITS = 32` and `BIP320_VERSION_ROLLING_BITS = 16`.
Collaborator

I prefer this phrasing:
"The size of the search space for one Standard Job is 2^(NONCE_BITS + BIP320_VERSION_ROLLING_BITS) = ~280Th, per nTime value, where NONCE_BITS = 32 and BIP320_VERSION_ROLLING_BITS = 16."
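
As a quick sanity check on the figures in this thread, the two search-space formulas can be evaluated directly. This is only a minimal sketch: the function names are hypothetical, and only the constants `NONCE_BITS = 32` and `BIP320_VERSION_ROLLING_BITS = 16` come from the spec text quoted above.

```python
NONCE_BITS = 32
BIP320_VERSION_ROLLING_BITS = 16


def standard_job_search_space() -> int:
    """Number of header candidates for one Standard Job at a fixed nTime."""
    return 2 ** (NONCE_BITS + BIP320_VERSION_ROLLING_BITS)


def extended_channel_search_space(extranonce_size: int) -> int:
    """Same quantity for an Extended Channel, where `extranonce_size` bytes can also be rolled."""
    return 2 ** (NONCE_BITS + BIP320_VERSION_ROLLING_BITS + extranonce_size * 8)


space = standard_job_search_space()
print(space)                     # 281474976710656
print(round(space / 1e12, 1))    # 281.5 (tera-hashes per nTime value)
```

`2^48` is roughly `2.8e14`, which is where the `~280Th` figure in the sentence comes from.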

Collaborator

According to this, does it mean we should think about the fact that newer machines are already surpassing this 280Th/s limit? (look at S21 hydro or S21 XP hydro)

Contributor

@Fi3 Fi3 Sep 6, 2024

HOM is not intended for powerful machines.

Contributor Author

@plebhash plebhash Sep 6, 2024

hmm so this phrase is wrong?

All SV2 mining devices are restricted to Standard Jobs. This is a big difference from legacy SV1 mining devices, where rolling extranonces is a common feature.

I wrote it based on my current understanding, and after reaching out to Braiins Firmware Support (Marin) and getting confirmation that SV2 Braiins Firmware is restricted to HOM.

Can some powerful SV2 Mining Device open Extended Channels and receive NewExtendedMiningJob?

Contributor

yes

Contributor Author

@plebhash plebhash Sep 7, 2024

this was a bit confusing to me, because it made me think: "what is the purpose of HOM Standard Jobs after all?"

and based on this question #98 (comment), I believe @GitGab19 also shares this feeling

I thought about @Fi3's answer and here's my current perspective on the trade-offs of Standard Jobs:


rolling Extranonces is an "advanced feature" that makes firmware more complex and harder to maintain, so from a business perspective, it should be avoidable if possible.

so a way to correct the phrase above would be:

All SV2 mining devices with hashrate under 280 TH/s should be restricted to HOM via Standard Jobs. This is a big difference from legacy SV1 mining devices, where rolling extranonces is a common feature. Hashing space optimization should be achieved upstream, where efficient telemetry is achieved by keeping track of multiple Standard Jobs via Group Channels.

If the SV2 mining device hashrate is above 280 TH/s, then it should support non-HOM via Extended Jobs.


hopefully the spec authors can confirm this in the near future (when we ask them to review this PR)

Collaborator

I think HOM can be used for almost any machine, provided it doesn't exceed the hash rate by orders of magnitude.

The device is supposed to receive new mining jobs periodically. Every new job resets the nTime value to the min_ntime for that job. So unless nTime rolls hours into the future within the usual 30-second window, I don't think it's a problem.

The problem of rolling nTime too far ahead of the true time is equivalent to the problem of "not receiving a new mining job frequently enough".
It's difficult to draw a hard line. It's probably better to be a bit vague about that. Say something like "be sane about nTime rolling or otherwise your work will be rejected".

quick calculation:
Assume we allow nTime to roll 10 minutes into the future for a given mining job that is being received every 30 seconds. To get to that point we would need 5.6 PH/s (= 280 TH/s * 600 / 30). I don't think we are going to see such devices in the foreseeable future (meaning a simple device controlled by a single controller; more complex devices split across multiple machines need to open an extended channel).
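
The quick calculation above can be reproduced as follows. This is a rough sketch, assuming each nTime value exposes the full `2^48` Standard Job search space and the device never repeats work; `hashrate_to_exhaust` is a hypothetical helper name, not something from the spec.

```python
NONCE_BITS = 32
BIP320_VERSION_ROLLING_BITS = 16
# Header candidates available per nTime value for one Standard Job.
SPACE_PER_NTIME = 2 ** (NONCE_BITS + BIP320_VERSION_ROLLING_BITS)


def hashrate_to_exhaust(ntime_roll_seconds: int, job_interval_seconds: int) -> float:
    """Hashrate (H/s) at which a device consumes `ntime_roll_seconds` nTime values,
    one full search space each, within a single job interval."""
    return SPACE_PER_NTIME * ntime_roll_seconds / job_interval_seconds


required = hashrate_to_exhaust(ntime_roll_seconds=600, job_interval_seconds=30)
print(f"{required / 1e15:.2f} PH/s")  # 5.63 PH/s
```

Using the exact `2^48` instead of the rounded 280 TH gives 5.63 PH/s, in line with the 5.6 PH/s estimate.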

Contributor Author

@plebhash plebhash Oct 26, 2024

thanks for the clarification!

I'll tag @rrybarczyk here since this topic was part of a discussion we had not long ago, while we went through this PR.

When we had this discussion, my understanding was that HOM over Standard Jobs was designed to cater for Mining Devices with hashrate below 280 TH/s. This clarifies that this is not the case at all. HOM is perfectly suitable for almost any kind of Mining Device, as long as jobs are updated frequently.

But now that leaves me wondering: is there any scenario where Mining Devices would open Extended Channels with an upstream?

Comment on lines 546 to 566
A Mining Proxy can forward multiple Standard Channels to the Upstream, ideally one representing each SV2 Mining Device Downstream.
The Upstream (Pool, JDC or another Mining Proxy) is expected to send a `SetGroupChannel` message aggregating different Standard Channels.
Collaborator

Is the SetGroupChannel triggered by some specific behaviour?
Or can the trigger for sending it be arbitrarily defined by pools?

Contributor Author

the main criterion I can think of is: all Standard Channels under the same Connection are unified under the same Group Channel

but maybe the spec authors can elaborate on some other scenarios

The Upstream (Pool, JDC or another Mining Proxy) is expected to send a `SetGroupChannel` message aggregating different Standard Channels.
The Upstream sends jobs via `NewExtendedMiningJob` to Group Channels, and if some Downstream is a Mining Device (i.e.: the `SetupConnection` had the `REQUIRES_STANDARD_JOBS` flag) the Mining Proxy converts that message into different `NewMiningJob` messages after calculating the correct Merkle Root based on each Standard Channel's `extranonce_prefix`.

Alternatively, a Mining Proxy could aggregate multiple Downstream Standard Channels into a single Extended Channel. The Upstream sends Jobs via `NewExtendedMiningJob`, and for each Downstream SV2 Mining Device, the Mining Proxy sends a different `NewMiningJob` message, where the Merkle Root is based on the Standard Channel's `extranonce_prefix`.
Collaborator

What could be the reason for a proxy to prefer opening many Standard Channels with the upstream, which are then grouped into a Group Channel? Why wouldn't that proxy directly open an Extended Channel with the upstream?

Contributor Author

@plebhash plebhash Sep 6, 2024

I asked a similar question when I was looking at some implementation details

stratum-mining/stratum#1145 (reply in thread)

@Fi3's answer was:

it might want to group standard channels from downstream, maybe a pool want to go that way to have telemetry or for whatever other reasons. But you should ask the spec's authors.

Hopefully we will get some reviews by the spec authors here soon, so they could also help further clarify this in more detail.

@plebhash
Contributor Author

plebhash commented Sep 6, 2024

It would also be useful to add the .dot files you used to generate these graphics, so they can be more easily modified in the future.

The graphics were generated via draw.io. The closest I can get to your suggestion is to export and commit .drawio.xml files, which can later be imported back into the tool and edited further.

I actually think this is a good idea.

@Sjors
Contributor

Sjors commented Sep 6, 2024

commit drawio.xml

That's a good start.

@plebhash plebhash force-pushed the update-mining-protocol branch 7 times, most recently from 69d2474 to 2c058bf Compare September 10, 2024 20:01
@rrybarczyk rrybarczyk self-requested a review September 17, 2024 14:55
@plebhash
Contributor Author

review by @mcelrath

plebhash/sv2-spec@update-mining-protocol...mcelrath:sv2-spec:patch-1

dropping this link here so I can dive into it later


### 5.1.4 Future Jobs
- The `extranonce_prefix` bytes are reserved for the upstream layer, where fixed bytes were already established for the Extranonce and are no longer available for rolling or search space splitting.
Contributor

Could we get an explanation of why they are needed, what purpose they serve, and how they are divided?

Contributor

Given that the extended extranonce field is an array of 32 bytes, let's assume we split it into x, y, and z. The downstream nodes use standard channels, and with one group channel, we can accommodate up to 2^32 standard channels. However, since x is non-zero and y + z < 32, we won't be able to support the maximum number of downstream channels.

Contributor Author

Could we get an explanation of why they are needed, what purpose they serve, and how they are divided?

do you mean Extended Extranonces as a whole, or extranonce_prefix specifically?

Contributor Author

@plebhash plebhash Sep 26, 2024

Given that the extended extranonce field is an array of 32 bytes, let's assume we split it into x, y, and z. The downstream nodes use standard channels, and with one group channel, we can accommodate up to 2^32 standard channels. However, since x is non-zero and y + z < 32, we won't be able to support the maximum number of downstream channels.

the wording on 2^32 channels per connection says:

There can theoretically be up to 2^32 open Channels within one Connection.

up to means this is a theoretical ceiling, but it does not imply this capacity will be filled every time... this should be interpreted as "there will never be more than 2^32 Channels per Connection" rather than "there can always be 2^32 Channels per Connection"

the only layer where the 2^32 capacity could ever be used is the most upstream layer (Pool or JDC), and by definition there are no fixed extranonce_prefix bits on this layer

but thanks for the feedback, I updated this phrase to make it more clear:

There can theoretically be up to 2^32 open Channels within one Connection. This is however just a theoretical ceiling, and it does not mean that every Connection will be able to fill this full capacity (maybe the search space has already been narrowed).

An empty future block job or speculated non-empty job can be sent in advance to speedup new mining job distribution.
The point is that the mining server MAY have precomputed such a job and is able to pre-distribute it for all active channels.
The only missing information to start to mine on the new block is the new prevhash.
![](./img/extended_extranonce_proxies.png)
Contributor

This image might be misleading for the pure SV2 side, as it ends the flow with the merkle_root. Can we have something like no rolling merkle root

Contributor

When transitioning from one proxy to another, are we concatenating the extranonce_prefix and the local reserved portion, or are we hashing them?

Contributor Author

This image might be misleading for the pure SV2 side, as it ends the flow with the merkle_root. Can we have something like no rolling merkle root

I agree that was misleading. Changed the Mining Device layer to extranonce_prefix, because that is what is actually assigned to each Standard Channel.

Only later, when Standard Jobs are created, the unique Merkle Root is calculated (coinbase_prefix+extranonce_prefix+coinbase_suffix)

Contributor Author

When transitioning from one proxy to another, are we concatenating the extranonce_prefix and the local reserved portion, or are we hashing them?

we concatenate.
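
To illustrate the concatenation, here is a hypothetical sketch of how a proxy might derive the `extranonce_prefix` it hands to a downstream channel. The helper name, the big-endian encoding of the local index, and the example byte lengths are assumptions made for illustration and are not taken from the spec; the 32-byte ceiling is the one mentioned earlier in this thread.

```python
MAX_EXTRANONCE_LEN = 32  # Extended Extranonce ceiling mentioned in this thread


def downstream_extranonce_prefix(
    upstream_prefix: bytes, local_index: int, local_reserved_len: int
) -> bytes:
    """Concatenate the prefix fixed upstream with this proxy's locally reserved bytes."""
    # to_bytes raises OverflowError if local_index does not fit in local_reserved_len bytes.
    local_part = local_index.to_bytes(local_reserved_len, "big")
    prefix = upstream_prefix + local_part
    if len(prefix) > MAX_EXTRANONCE_LEN:
        raise ValueError("extranonce_prefix exceeds the Extended Extranonce size")
    return prefix


# Hypothetical two-layer setup: pool -> proxy -> mining device.
pool_prefix = bytes.fromhex("00a1")
proxy_prefix = downstream_extranonce_prefix(pool_prefix, local_index=7, local_reserved_len=2)
device_prefix = downstream_extranonce_prefix(proxy_prefix, local_index=3, local_reserved_len=1)
print(device_prefix.hex())  # 00a1000703
```

Each layer only appends bytes, so the prefix received from upstream is never altered on the way down.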

- `extranonce_size`: how many bytes are available for the locally reserved and downstream reserved areas of the Extended Extranonce.

And a [Standard Channel](#531-standard-channels) has the following property:
- `extranonce_prefix` the Extended Extranonce bytes that were already allocated by the upstream server.
Contributor

Why does a standard channel even need extra nonce information if it's only used for header-only mining?

Contributor

@Fi3 Fi3 Sep 18, 2024

you have fewer messages to send from the pool to the proxy

Contributor Author

@plebhash plebhash Sep 26, 2024

multiple standard channels are aggregated into a group channel

when a NewExtendedMiningJob message comes into the group channel, it needs to be converted into multiple NewMiningJob messages

each NewMiningJob message needs to have a unique Merkle Root, otherwise there would be collision in the search space

the uniqueness of each Merkle Root is determined by the static extranonce_prefix of each standard channel (by combining it with coinbase_prefix and coinbase_suffix from the Extended Job that came from upstream)

overall, the trick to assimilate this is to think in terms of proxy layers: usually, there are multiple layers of extended jobs coming from upstream, and only on the last layer they are converted into standard jobs, where Merkle Roots have to be unique
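
The conversion described above can be sketched roughly as follows. This is an illustrative assumption about the mechanics, not code from any implementation: the function names and placeholder bytes are hypothetical, and the sketch ignores SegWit serialization details.

```python
import hashlib


def sha256d(data: bytes) -> bytes:
    """Bitcoin's double SHA-256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root_for_channel(
    coinbase_prefix: bytes,
    extranonce_prefix: bytes,
    coinbase_suffix: bytes,
    merkle_path: list[bytes],
) -> bytes:
    """Merkle Root for one Standard Channel, derived from an Extended Job."""
    # Simplification: a real coinbase carrying witness data must be hashed
    # without that witness data to obtain the txid.
    coinbase_tx = coinbase_prefix + extranonce_prefix + coinbase_suffix
    node = sha256d(coinbase_tx)
    # The coinbase is the leftmost leaf, so each sibling is appended on the right.
    for sibling in merkle_path:
        node = sha256d(node + sibling)
    return node


# Placeholder job data (not a valid transaction), shared by every channel in the group.
prefix, suffix = b"coinbase-prefix", b"coinbase-suffix"
path = [sha256d(b"tx1"), sha256d(b"tx2-tx3")]

root_a = merkle_root_for_channel(prefix, bytes.fromhex("00a1000701"), suffix, path)
root_b = merkle_root_for_channel(prefix, bytes.fromhex("00a1000702"), suffix, path)
print(root_a != root_b)  # True: distinct prefixes give distinct search spaces
```

Because `coinbase_prefix`, `coinbase_suffix` and the Merkle path are identical for every channel in the group, the per-channel `extranonce_prefix` is the only input that makes the resulting roots differ.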

Each Channel identifies a dedicated mining session associated with an authorized user.
Upstream stratum nodes accept work submissions and specify a mining target on a per-channel basis.

There can theoretically be up to `2^32` open Channels within one Connection.
Contributor

Could you please point me to the part of the codebase where we are implementing the multiplexing?

Contributor Author

I'm not sure what you mean by multiplexing here.

but maybe the channel_logic module of roles_logic_sv2 could provide you some insight (as well as its usage on roles crates)


A proxy can either transparently allow its clients to open separate Channels with the server (preferred behavior), or aggregate open connections from downstream devices into its own open channel with the server and translate the messages accordingly (present mainly for allowing v1 proxies).
Both options have some practical use cases.
In either case, proxies SHOULD aggregate clients' Channels into a smaller number of Connections.
Contributor

None of the SV1 miners are going to have a standard channel since they require a non-homogeneous setup. How does aggregation work in that case? At the same time, I have this question: each of these miners (clients) connects to the translator proxy, and each connection is a TCP connection. So how are we achieving multiplexing? My understanding was that with a single TCP connection, we could manage around 2^32 channels, but these channels should point to the same endpoint and starting point. Each of these miners connects to the same endpoint (the translator proxy), but they each have different ports. Each server can only handle a limited number of TCP connections, which is nowhere near 2^32—more like on the order of 10^5. So, the aggregation doesn’t happen at the level of the translator proxy and the miner, but rather between the translator proxy and another upstream connection, since they will have a single TCP connection through which we can have 2^32 channels for each downstream miner node. However, this raises a concern: the translator proxy becomes a single point of failure, so the provider needs to implement scaling strategies to address that.

Contributor Author

None of the SV1 miners are going to have a standard channel since they require a non-homogeneous setup. How does aggregation work in that case?

tProxy makes sure that each SV1 miner is assigned a unique extranonce_prefix

Contributor Author

@plebhash plebhash Sep 26, 2024

My understanding was that with a single TCP connection, we could manage around 2^32 channels, but these channels should point to the same endpoint and starting point. Each of these miners connects to the same endpoint (the translator proxy), but they each have different ports. Each server can only handle a limited number of TCP connections, which is nowhere near 2^32—more like on the order of 10^5. So, the aggregation doesn’t happen at the level of the translator proxy and the miner, but rather between the translator proxy and another upstream connection, since they will have a single TCP connection through which we can have 2^32 channels for each downstream miner node. However, this raises a concern: the translator proxy becomes a single point of failure, so the provider needs to implement scaling strategies to address that.

I see the practical limitation you are pointing out here. However, I think this is easily solvable by stacking multiple proxy layers.

This limitation only manifests if we try to connect more than 10^5 machines into one single proxy. But if they are distributed across multiple proxies, there's no issue.

And if we imagine there's one "funnel" proxy aggregating all of them, and then forwarding that into some upstream layer (maybe a pool or JDC), this upstream layer would still be able to have up to 2^32 channels in a single Connection.

Here are some comments about the practical aspects of this:

@plebhash plebhash force-pushed the update-mining-protocol branch 4 times, most recently from 7ac6b1d to 89eb9bc Compare September 26, 2024 15:13
Comment on lines +1 to +6
In case the `*.png` files on the parent directory need to be adjusted, the `*.drawio.xml` files from this directory
should be imported into the `draw.io` online tool for editing.

After editing:
- export the `.png` file to replace the old one in the parent directory
- export the `.drawio.xml` file to replace the old one on this directory
Contributor Author

@Sjors there you go

#98 (comment)

@plebhash plebhash force-pushed the update-mining-protocol branch from 8379f73 to d25faf4 Compare October 29, 2024 22:37
@plebhash plebhash force-pushed the update-mining-protocol branch 2 times, most recently from d6f69dd to e474030 Compare November 1, 2024 20:07
@pavlenex pavlenex requested a review from jakubtrnka November 5, 2024 17:06
@pavlenex
Contributor

pavlenex commented Nov 5, 2024

Hey @jakubtrnka would be great to get your final review on this one. @Fi3 if you get a chance would be great to get your pair of eyes as well. 🙏

@plebhash plebhash force-pushed the update-mining-protocol branch 3 times, most recently from 5415eb5 to e10349e Compare November 16, 2024 00:15
@plebhash plebhash force-pushed the update-mining-protocol branch from e10349e to b04146b Compare November 26, 2024 02:08
@pavlenex pavlenex requested a review from Fi3 November 26, 2024 17:08
@Fi3
Contributor

Fi3 commented Nov 28, 2024

ack

Contributor

@Shourya742 Shourya742 left a comment

ACK. Minor nits; the rest looks consistent with the discussion we had and the diagrams you shared.

Standard channels opened within one particular connection can be grouped together to be addressable by a common communication group channel.

Whenever a Standard Channel is created, it is always put into some Group Channel identified by its `group_channel_id`.
Group Channel ID namespace is the same as Channel ID namespace on a particular connection but the values chosen for Group Channel IDs must be distinct.
Contributor

Aren't all channel IDs implicitly distinct?

Contributor Author

hmm I think I see your point

that essentially boils down to the fact that this phrase is somewhat redundant, and could be shortened into this:

Suggested change
Group Channel ID namespace is the same as Channel ID namespace on a particular connection but the values chosen for Group Channel IDs must be distinct.
Group Channel ID namespace is the same as Channel ID namespace on a particular connection.

however, we shouldn't simply imply that Standard Channel ID namespace is unique (one of the main goals of this PR is to completely avoid leaving foundational concepts simply implied)

we should make that explicitly clear by adding a new phrase into the section about Standard Channel

Contributor Author

@plebhash plebhash Dec 2, 2024

fixed via 38bd130

@plebhash plebhash force-pushed the update-mining-protocol branch from 9a58ede to b74c7a2 Compare December 2, 2024 13:07
plebhash and others added 2 commits December 2, 2024 20:23
singular is better for formal definition of titles

Co-authored-by: bit-aloo <[email protected]>
@plebhash plebhash force-pushed the update-mining-protocol branch from 38bd130 to 4e27a06 Compare December 4, 2024 21:30
@GitGab19 GitGab19 merged commit 52e1fa2 into stratum-mining:main Dec 10, 2024
@plebhash plebhash deleted the update-mining-protocol branch December 11, 2024 03:18
Development

Successfully merging this pull request may close these issues.

need to cleanup and update Mining Protocol specs
9 participants