feat(metric-engine): introduce index options from metric engine #5374
base: main
c26eb4a to f8e301a
3f1c718 to edea1c6
@@ -71,6 +71,27 @@ DESC TABLE phy;
| job | String | PRI | YES | | TAG |
+------------+----------------------+-----+------+---------+---------------+

SHOW CREATE TABLE phy;
Why can't we see the __table_id and __tsid columns?
The __table_id and __tsid columns are internal. These columns and their column options are generated by the metric engine.
@@ -529,7 +542,8 @@ impl MetricEngineInner {
 DATA_SCHEMA_TSID_COLUMN_NAME,
 ConcreteDataType::uint64_datatype(),
 false,
-),
+)
+.with_inverted_index(false),
Let's clean up greptimedb/src/metric-engine/src/engine/options.rs, lines 36 to 39 in 7eaabb3:
options.insert(
    "index.inverted_index.ignore_column_ids".to_string(),
    IGNORE_COLUMN_IDS_FOR_DATA_REGION.iter().join(","),
);
@@ -142,6 +148,19 @@ impl DataRegion {

c.column_id = new_column_id_start + delta as u32;
c.column_schema.set_nullable();
match index_options {
    IndexOptions::Inverted => {
        c.column_schema.set_inverted_index(true);
This is an incompatible change. Once any column's schema is explicitly set to have an inverted index, the condition for indexing tags by default no longer holds.
Yes... it's OK for newly created tables, but altering these options on existing tables is problematic.
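To make the compatibility concern in this thread concrete, here is a minimal sketch of the rule as the reviewer describes it: tags are indexed by default only while no column carries an explicit inverted-index setting. The `Column` struct and `indexed_columns` function are hypothetical and written only for illustration. The rule itself is an assumption inferred from the comment above, not a copy of GreptimeDB's implementation.

```rust
// Hypothetical model of the concern raised in this thread; not GreptimeDB code.
#[derive(Debug, Clone)]
pub struct Column {
    pub name: &'static str,
    pub is_tag: bool,
    /// `true` only when the schema explicitly requests an inverted index.
    pub explicit_inverted: bool,
}

/// Returns the names of the columns that would be indexed under the assumed rule:
/// - if no column is explicitly marked, every tag column is indexed by default;
/// - once any column is explicitly marked, only the marked columns are indexed.
pub fn indexed_columns(columns: &[Column]) -> Vec<&'static str> {
    let any_explicit = columns.iter().any(|c| c.explicit_inverted);
    columns
        .iter()
        .filter(|c| if any_explicit { c.explicit_inverted } else { c.is_tag })
        .map(|c| c.name)
        .collect()
}
```

Under this assumed rule, explicitly marking one newly added column would silently drop the default index from every other tag, which is why altering an existing table is the risky case.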
I hereby agree to the terms of the GreptimeDB CLA.
Refer to a related PR or issue link (optional)
#5282
What's changed and what's your intention?
This PR introduces index options to the metric engine, addressing scenarios with high cardinality where inverted index files can grow excessively large. By allowing users to specify index types and parameters, such as skipping indexes with customizable granularity, users can better control index behavior and optimize storage.
For example:
In this case, all auto-added columns will apply the index options specified by the user.
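As a rough illustration of that behavior, the sketch below applies one user-chosen index option to every auto-added column. Only `IndexOptions::Inverted` and the act of setting an inverted index on a column schema appear in this PR's diff. The `ColumnSchema` struct, the `Skipping` variant with its `granularity` field, and `apply_index_options` are hypothetical stand-ins, not the metric engine's actual types.

```rust
// Hypothetical stand-ins for illustration; not the metric engine's real types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum IndexOptions {
    /// Build an inverted index on each auto-added column.
    Inverted,
    /// Build a skipping index with a user-chosen granularity (assumed shape).
    Skipping { granularity: u32 },
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ColumnSchema {
    pub name: String,
    pub nullable: bool,
    pub inverted_index: bool,
    pub skipping_granularity: Option<u32>,
}

impl ColumnSchema {
    /// Creates a column with no index settings.
    pub fn new(name: &str) -> Self {
        Self {
            name: name.to_string(),
            nullable: false,
            inverted_index: false,
            skipping_granularity: None,
        }
    }
}

/// Marks every auto-added column nullable and applies the same index option to it.
pub fn apply_index_options(columns: &mut [ColumnSchema], options: IndexOptions) {
    for c in columns.iter_mut() {
        c.nullable = true;
        match options {
            IndexOptions::Inverted => {
                c.inverted_index = true;
                c.skipping_granularity = None;
            }
            IndexOptions::Skipping { granularity } => {
                c.inverted_index = false;
                c.skipping_granularity = Some(granularity);
            }
        }
    }
}
```

Keeping the option as a single enum value passed down to the column-adding step means every column added later picks up the same setting without per-column configuration.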
If a logical table is created on the physical table:
The resulting physical table will have the following schema:
PR Checklist
Please convert it to a draft if some of the following conditions are not met.