Document BigQuery RPC settings #24638

Open
wants to merge 1 commit into
base: master
from

Conversation

findinpath
Contributor

Description

Documents the settings from the https://github.com/trinodb/trino/blob/29ffc6c1c7d93bbd22ccdccb98ba39a5fab1fd68/plugin/trino-bigquery/src/main/java/io/trino/plugin/bigquery/BigQueryRpcConfig.java config class.

Additional context and related issues

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text:

## Section
* Fix some things. ({issue}`issuenumber`)

@cla-bot cla-bot bot added the cla-signed label Jan 7, 2025
@github-actions github-actions bot added the docs label Jan 7, 2025
@findinpath findinpath requested review from ebyhr and pajaks January 7, 2025 11:57
@ebyhr
Member

ebyhr commented Jan 7, 2025

I think we should remove @ConfigHidden annotation from BigQueryRpcConfig if we document these properties.
cc: @wendigo

@findinpath
Contributor Author

I think we should remove @ConfigHidden annotation from BigQueryRpcConfig if we document these properties.

Yes, waiting for feedback from your side @ebyhr on whether we should do this before proceeding with this PR.

used for gRPC communication.
- `1`
* - `bigquery.channel-pool.min-size`
- The absolute minimum size of the connection pool, also known as a channel
Member

Suggested change
- The absolute minimum size of the connection pool, also known as a channel
- The minimum size of the connection pool, also known as a channel

Member

Also ... size in what .. maybe better

The minimum number of connections in the connection pool, ...

pool, used for gRPC communication.
- `1`
* - `bigquery.channel-pool.max-size`
- The absolute maximum size of the connection pool, also known as a channel
Member

Suggested change
- The absolute maximum size of the connection pool, also known as a channel
- The maximum size of the connection pool, also known as a channel

Member

See above also

@@ -177,6 +177,59 @@ a few caveats:
- Enable using Apache Arrow serialization when reading data from BigQuery.
Please read this [section](bigquery-arrow-serialization-support) before using this feature.
- `true`
* - `bigquery.channel-pool.initial-size`
- The initial size of the connection pool, also known as a channel pool,
used for gRPC communication.
Member

do we explain gRPC and RPC somewhere earlier in the doc? If not .. we might want to .. or at least link to some external doc

@@ -177,6 +177,59 @@ a few caveats:
- Enable using Apache Arrow serialization when reading data from BigQuery.
Please read this [section](bigquery-arrow-serialization-support) before using this feature.
Member

Can you remove the "Please" here .. in a separate commit if you want

- `1`
* - `bigquery.channel-pool.min-rpc-per-channel`
- Threshold to start scaling down the channel pool.
When the average of the maximum number of outstanding RPCs in a single
Member

Can we simplify to

Suggested change
When the average of the maximum number of outstanding RPCs in a single
When the average of outstanding RPCs in a single

- The maximum number of retry attempts to perform for the RPC calls.
If this value is set to `0`, the logic will instead use the
`bigquery.rpc-timeout` configuration value to determine retries.
In the event that both the `bigquery.rpc-retries` and
Member

too many "both"

In the event that both the `bigquery.rpc-retries` and
`bigquery.rpc-timeout` values are both `0`, the logic will not retry.
If this value is positive, and the number of attempts exceeds
`bigquery.rpc-retries` limit, the logic will give up retrying even if
Member

same as preceding

* - `bigquery.rpc-timeout`
- Timeout [duration](prop-type-duration) on when the retries for the
RPC call should be given up completely. The higher the timeout, the
more retries can be attempted. If this value is `0s`, then
Member

then the value from bigquery.rpc-retries is used.

RPC call should be given up completely. The higher the timeout, the
more retries can be attempted. If this value is `0s`, then
the logic will instead use `bigquery.rpc-retries` to determine retries.
In the event that `bigquery.rpc-retries` and `bigquery.rpc-timeout`
Member

Retry is deactivated when both .. and .. are 0.

In the event that `bigquery.rpc-retries` and `bigquery.rpc-timeout`
values are both `0`, the logic will not retry.
If this value is positive, and the retry duration has reached the timeout
value, the logic will give up retrying even the number of attempts is
Member

Suggested change
value, the logic will give up retrying even the number of attempts is
value, retries stop.

3 participants