RFC: temporal partition rotation automation #17427
Comments
I have several use-cases for this; having native Vitess support for this is a big 👍 About the expected problems:
I would be a fan of running this automatically, but having a command to be triggered by some control process would also work. Just my 2 cents. Absolutely fabulous that this is being considered for implementation!
That already works, just an implicit behavior of existing Online DDL!
Absolutely. I'm wondering, though, how Vitess will communicate that there's an orphaned rule. Where will the error messages go?
That's an interesting idea. Giving this more thought!
Further design notes: what would this look like?
Config/rule
We'd have a rule per-table, with these fields:
User facing commands
The user would have access to commands such as:
Where rules are stored
There are two options:
CREATE TABLE IF NOT EXISTS partition_rotation_rules
(
`id` bigint unsigned NOT NULL AUTO_INCREMENT,
`keyspace` varchar(256) NOT NULL,
`shard` varchar(255) NOT NULL,
`mysql_schema` varchar(128) NOT NULL,
`mysql_table` varchar(128) NOT NULL,
`description` text NOT NULL,
`interval_name` varchar(32) NOT NULL,
`interval_mode` tinyint unsigned NOT NULL DEFAULT '0',
`create_disabled` tinyint unsigned NOT NULL DEFAULT '0',
`create_ahead_count` int unsigned NOT NULL DEFAULT '0',
`expire_disabled` tinyint unsigned NOT NULL DEFAULT '0',
`expire_interval_name` varchar(32) NOT NULL,
`expire_interval_count` int unsigned NOT NULL DEFAULT '0',
`only_expire_on_prepare` tinyint unsigned NOT NULL DEFAULT '0',
PRIMARY KEY (`id`)
)
Basically, this maps a table to a rule's config. We would then have a copy of a table's rule in each shard. When we create/update/drop a rule, we must ensure the change is submitted to all shards. This will be done in a similar way to
Further notes about rules
Either by
How rotation is applied
This will be a task for the Online DDL executor, a component which already manifests as a state machine, and which routinely observes the state of schema migrations. Breakdown:
When?
There are multiple ways to go about "when to do the rotation". We could:
I'm leaning towards (3), and I'll explain why: suppose we just check the situation every 30 minutes (which is short enough for hourly rotation). Most of the time we will have nothing to do. Once in a while, we will have something to do. Imagine we need to do a daily rotation on table t, and that we need to (a) prepare 7 days ahead, and (b) drop partitions older than 30 days. Per #17426, we would just try to see if there's any need to create 7 partitions ahead. Most of the time the answer will be "nope! there are already 7 empty partitions ahead of time". But the first run right after midnight will find that "yes, there are only 6 empty partitions ahead of time, hence we need to create one more".
A 30 minute interval is naive. We can make the behavior more intelligent, given that
create/add ahead partitions
This is an argument for (1) - letting the user choose their preferred time of rotation, or (4) - again throwing the problem at the user.
drop/expire partitions
We could do that, but we can employ a more sophisticated approach, utilizing the
Scheduling and execution
Whatever the timing is, Online DDL would use
This deterministic UUID evaluation conveniently eliminates any concerns of the "what if Online DDL tries creating something that already exists due to PRS/ERS" family of questions. Online DDL always ignores migrations with an already existing UUID (and doesn't even submit such migrations). Online DDL will then submit the migration onto itself, i.e. create a
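To illustrate why a deterministic UUID makes repeated submissions harmless, here is a minimal Python sketch. It is not how Vitess derives the UUID (that detail is not shown above); the namespace, the key layout and the helper names `rotation_migration_uuid` and `submit_if_new` are all assumptions made for illustration.

```python
import uuid

# Hypothetical namespace, for illustration only; not a Vitess constant.
ROTATION_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "partition-rotation.example")


def rotation_migration_uuid(keyspace, shard, table, statement):
    """Derive a stable UUID from the identity of one rotation step.

    The same inputs always yield the same UUID, so a retry after a
    failover would recompute an identifier that is already known.
    """
    key = "/".join([keyspace, shard, table, statement])
    return str(uuid.uuid5(ROTATION_NAMESPACE, key))


def submit_if_new(known_uuids, migration_uuid):
    """Mimic 'ignore migrations with an already existing UUID'.

    Returns True if the migration was recorded, False if it was skipped.
    """
    if migration_uuid in known_uuids:
        return False
    known_uuids.add(migration_uuid)
    return True
```

With this shape, two evaluations of the same rotation step (for example, before and after a PRS/ERS) produce the same identifier, and the second submission is skipped rather than duplicated.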
I'm proposing that Vitess can automate temporal partition rotation based on user defined rules.
What are temporal partitioned tables?
Tables that are PARTITIONED BY RANGE and over temporal (time-based) values. This would be either:
- PARTITION BY RANGE COLUMNS (col_name) over a single column, that is either DATE or DATETIME (technically MySQL also allows TIME but that is not so interesting to partition rotation).
- PARTITION BY RANGE (func(col_name)) where func is one of selected functions such as TO_DAYS, and the column is again either DATE or DATETIME.
For practical reasons, we will have a predefined set of allowed functions/expressions. For example, we will support PARTITION BY RANGE (TO_DAYS(my_column)) or likewise YEAR(my_column) but we will not support ROUND(SQRT(TO_DAYS(my_column)+3.14)). We aim for practical operational scenarios. Most users will rotate hourly, daily, possibly weekly, monthly, yearly.
What is the proposal?
We will have a per-table rule:
Vitess will periodically look at all the rules, and check all referenced tables. For each table, if applicable, it will generate a sequence of Online DDL migrations, with internal UUID and in-order-execution, that ensure the table is in the required state. Since it will do this periodically, most of the time this will be a no-op, since the table will already have all the required future partitions, and will have dropped expired partitions.
Where are rules to be stored?
I'm thinking as part of the Keyspace record in topo, much like the throttler configuration. The config will be copied from Keyspace to SrvKeyspace as needed, again just like the throttler configuration.
What are expected problems?
I'm not sure how to handle errors. For example:
TO_DAYS expression, making the minimal interval 24h?
So where should these errors go? As this is a background operation, there's no occasion to respond to the user with the list of errors.
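One class of error hinted at above is a rule whose interval is finer than what the partitioning expression can represent, such as an hourly rule on a table partitioned by TO_DAYS. The Python sketch below shows one way such a check could collect errors into a list that a background process could then store or report. The interval names, the `FINEST_INTERVAL` mapping and the function name are assumptions for illustration, not part of the proposal.

```python
# Coarseness ranking of rule intervals; names are illustrative assumptions.
INTERVAL_RANK = {"hour": 0, "day": 1, "week": 2, "month": 3, "year": 4}

# Assumed finest interval each partitioning expression can represent.
FINEST_INTERVAL = {"TO_DAYS": "day", "YEAR": "year"}


def validate_rule_interval(interval_name, expression):
    """Return a list of human-readable errors; an empty list means valid."""
    errors = []
    if interval_name not in INTERVAL_RANK:
        errors.append(f"unknown interval: {interval_name}")
        return errors
    finest = FINEST_INTERVAL.get(expression)
    if finest is None:
        errors.append(f"unsupported partitioning expression: {expression}")
    elif INTERVAL_RANK[interval_name] < INTERVAL_RANK[finest]:
        errors.append(
            f"interval {interval_name} is finer than {expression} allows "
            f"(minimum: {finest})"
        )
    return errors
```

Returning errors as data rather than raising keeps the open question (where the errors should go) separate from the validation itself.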
Alternative approaches
With rules still in place, maybe Vitess should not auto-rotate. Instead, maybe we should have a vtctldclient RotatePartitionedTables command, that will:
The partition analysis should still be made on a per-shard basis, to ensure independence of shards and eventual consistency, much like we delegate all Online DDL changes to shards.
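The per-shard analysis can be pictured with a small Python sketch. It is a simplification for daily rotation only, and it is not the schemadiff implementation: the function name, the use of plain dates to stand for partitions, and the exact cutoff semantics are assumptions made for illustration.

```python
from datetime import date, timedelta


def plan_daily_rotation(today, existing, ahead_count, expire_days):
    """Compute the rotation plan for one shard's copy of a table.

    existing: set of dates, each standing for the day one partition covers.
    Returns (to_create, to_drop) as sorted lists of dates.
    """
    # Partitions we want to exist ahead of time: tomorrow .. today + ahead_count.
    wanted = [today + timedelta(days=i) for i in range(1, ahead_count + 1)]
    to_create = [d for d in wanted if d not in existing]
    # Partitions covering days older than the retention window are expired.
    cutoff = today - timedelta(days=expire_days)
    to_drop = sorted(d for d in existing if d < cutoff)
    return to_create, to_drop
```

Running the same analysis again after the plan has been applied yields two empty lists, which matches the expectation that most periodic checks are a no-op. Each shard can run it independently against its own set of partitions.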
Existing work
schemadiff PR to analyze temporal range partitioned tables and to generate required creation of ahead-of-time partitions, and purge expired partitions: #17426. This also validates intervals and other constraints.
Also related: