Core: Add new CommandRequest - Pipeline #2954

Open
wants to merge 5 commits into base: main

Conversation

shohamazon
Collaborator

@shohamazon shohamazon commented Jan 15, 2025

This PR introduces a new CommandRequest type: Pipeline. A Pipeline represents a batch of commands that are sent to Valkey for execution, similar to a Transaction. However, unlike a transaction, a pipeline has the following distinguishing characteristics:

  1. Non-Atomic Execution:
    Transactions in Valkey are atomic. Pipelines, in contrast, do not provide such a guarantee.
  2. Multi-Node Support:
    Transactions are limited to a single Valkey node, because all commands within a transaction must belong to the same slot in cluster mode. Pipelines, however, can span multiple nodes, allowing commands to target different slots or to use multi-node commands (e.g., PING, or an MSET whose keys map to different slots). In a transaction, such multi-node commands are simply routed to a single node. (A client-level sketch of the difference follows this list.)
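
As a rough client-level illustration of the difference, here is a minimal sketch against the plain redis-rs builder API that glide-core builds on (the key names and connection URL are placeholders, not part of this PR):

    fn main() -> redis::RedisResult<()> {
        let client = redis::Client::open("redis://127.0.0.1/")?;
        let mut con = client.get_connection()?;

        // Non-atomic pipeline: commands are batched on the wire but executed
        // independently, so in cluster mode they may target different slots/nodes.
        let (v1, v2): (String, String) = redis::pipe()
            .cmd("SET").arg("{slot1}key").arg("a").ignore()
            .cmd("SET").arg("{slot2}key").arg("b").ignore()
            .cmd("GET").arg("{slot1}key")
            .cmd("GET").arg("{slot2}key")
            .query(&mut con)?;

        // Transaction: the same builder with `.atomic()` wraps the commands in
        // MULTI/EXEC, so in cluster mode all keys must hash to the same slot.
        let (v3,): (String,) = redis::pipe()
            .atomic()
            .cmd("SET").arg("{slot1}key").arg("c").ignore()
            .cmd("GET").arg("{slot1}key")
            .query(&mut con)?;

        println!("{v1} {v2} {v3}");
        Ok(())
    }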

Implementation Details

To support the execution of pipelines in cluster mode, this PR introduces several changes (a rough end-to-end sketch follows this list):

1. Pipeline Splitting:

Since a pipeline can include commands targeting different slots, it needs to be split into sub-pipelines, each grouped by the node responsible for the relevant slots.
The process involves mapping each command in the pipeline to its corresponding node(s) based on the cluster's slot allocation. This mapping is handled by routing logic, which determines whether a command targets a single node or multiple nodes.

2. Node Communication:

Once the pipeline is split, each sub-pipeline is sent to its respective node for execution.
For commands that span multiple nodes, the implementation ensures the responses are tracked and aggregated to form a cohesive result.

3. Response Aggregation:

To handle multi-node commands, the responses from each node are stored along with the node's address. This allows for proper aggregation and processing of results, particularly when commands require combining responses (e.g., for commands like MGET).
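
The snippet below is a rough, self-contained sketch of this split-and-aggregate flow, not the actual glide-core code: commands are plain strings, the slot_owner helper is a hypothetical stand-in for slot hashing plus topology lookup, and replies are fabricated rather than read from real connections. Its only purpose is to show the index bookkeeping that keeps responses in the original pipeline order.

    use std::collections::HashMap;

    // Hypothetical stand-in for slot hashing + cluster topology lookup.
    fn slot_owner(cmd: &str) -> String {
        if cmd.contains("key_a") { "node-a:6379".into() } else { "node-b:6379".into() }
    }

    fn main() {
        let pipeline = ["GET key_a", "GET key_b", "SET key_a 1"];

        // 1. Split: group commands by owning node, remembering each command's
        //    original position in the pipeline.
        let mut sub_pipelines: HashMap<String, Vec<(usize, &str)>> = HashMap::new();
        for (index, cmd) in pipeline.iter().copied().enumerate() {
            sub_pipelines.entry(slot_owner(cmd)).or_default().push((index, cmd));
        }

        // 2. Execute each sub-pipeline (fabricated replies, tagged with the node address).
        // 3. Aggregate: write every reply back into its original slot in the result vector.
        let mut responses: Vec<Option<(String, String)>> = vec![None; pipeline.len()];
        for (address, commands) in sub_pipelines {
            for (index, cmd) in commands {
                responses[index] = Some((format!("reply to `{cmd}`"), address.clone()));
            }
        }

        for (index, reply) in responses.iter().enumerate() {
            println!("{index}: {reply:?}");
        }
    }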

Summary

This PR introduces the Pipeline request type, enabling non-atomic batch command execution in Glide.

Issue link

This Pull Request is linked to issue (URL): [REPLACE ME]

Checklist

Before submitting the PR make sure the following are checked:

  • This Pull Request is related to one issue.
  • Commit message has a detailed description of what changed and why.
  • Tests are added or updated.
  • CHANGELOG.md and documentation files are updated.
  • Destination branch is correct - main or release
  • Create merge commit if merging release branch into main, squash otherwise.

shohamazon and others added 4 commits January 15, 2025 14:04
@ikolomi ikolomi self-requested a review January 16, 2025 10:53
@shohamazon shohamazon added the Rust core (redis-rs/glide-core matter) and Core changes (marks a PR with significant changes that should trigger the full matrix tests) labels on Jan 16, 2025
@shohamazon shohamazon marked this pull request as ready for review January 16, 2025 10:55
@shohamazon shohamazon requested a review from a team as a code owner January 16, 2025 10:55
pipeline: &'a redis::Pipeline,
) -> redis::RedisFuture<'a, Value> {
let command_count = pipeline.cmd_iter().count();
let _offset = command_count + 1; //TODO: check
Collaborator

resolve TODO

.push((index, inner_index));
}

async fn routes_pipeline_commands(
Collaborator

find a better name and make it shorter

pipeline: &crate::Pipeline,
core: Core<C>,
) -> RedisResult<(
HashMap<String, (Pipeline, C, Vec<(usize, Option<usize>)>)>,
Collaborator

encapsulate in struct

match cluster_routing::RoutingInfo::for_routable(cmd) {
Some(cluster_routing::RoutingInfo::SingleNode(SingleNodeRoutingInfo::Random))
| None => {
if pipelines_by_connection.is_empty() {
Collaborator

comment

} else {
// since the map is not empty, add the command to a random connection within the map.
let mut rng = rand::thread_rng();
let keys: Vec<_> = pipelines_by_connection.keys().cloned().collect();
Collaborator

rename to addresses

Collaborator

think about a way not to clone the addresses

for (index, routing_info, response_policy) in response_policies {
#[allow(clippy::type_complexity)]
// Safely access `values_and_addresses` for the current index
let response_receivers: Vec<(
Collaborator

use structs for complex types


// Collect final responses
for mut value in values_and_addresses.into_iter() {
assert_eq!(value.len(), 1);
Collaborator

don't use asserts in prod code

Collaborator

Use index 0 instead of popping and unwrapping

.map_err(|err| (OperationTarget::FanOut, err))?;

// Update `values_and_addresses` for the current index
values_and_addresses[index] = vec![(aggregated_response, "".to_string())];
Collaborator

use index 0 for storing aggregated_response

Collaborator

Let's try to use Pipeline for both transactions and pipelines, differentiating by an is_atomic flag

@@ -501,6 +501,10 @@ message Transaction {
repeated Command commands = 1;
}

message Pipeline {
Collaborator

Let's remove Transaction and use Pipeline + an is_atomic flag

Collaborator

Please also add comments describing things there

Collaborator

From the protocol's point of view, the only difference between a pipeline and a transaction is two extra commands: MULTI and EXEC
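
A minimal sketch of that point using the plain redis-rs builder (the command streams in the comments are illustrative; this is not the packing code used in glide-core):

    fn main() {
        // Non-atomic pipeline: the commands go out back to back, exactly as queued:
        //   SET k1 v1
        //   SET k2 v2
        let mut pipeline = redis::pipe();
        pipeline.cmd("SET").arg("k1").arg("v1").cmd("SET").arg("k2").arg("v2");

        // Transaction: the same commands, framed by the two extra commands:
        //   MULTI
        //   SET k1 v1
        //   SET k2 v2
        //   EXEC
        let mut transaction = redis::pipe();
        transaction.atomic().cmd("SET").arg("k1").arg("v1").cmd("SET").arg("k2").arg("v2");
    }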

Collaborator Author

also being atomic + whether multiple slots are allowed

Collaborator

@barshaul barshaul left a comment

Some initial notes (note to self - got up to handle_single_node_route)

)
})
} else {
// Pipeline is not atomic, so we can have commands with different slots.
Collaborator

Please add documentation to the function and state that for non-atomic pipelines it returns None

Comment on lines 1129 to 1131
let addresses: Vec<_> = pipeline_map.keys().cloned().collect();
let random_address = addresses.choose(&mut rng).unwrap();
let context = pipeline_map.get_mut(random_address).unwrap();
Collaborator

@barshaul barshaul Jan 19, 2025

  1. Don't use unwrap in production code. If a bug caused pipeline_map.keys() to be empty, it would crash the whole client. Instead, change this function to return a Result and return a ClientError if no random address is found.
  2. Copying all addresses is redundant; you can achieve the same with:
    // requires `use rand::seq::IteratorRandom;` for `choose` on iterators
    let mut rng = rand::thread_rng();
    if let Some(node_context) = pipeline_map
        .values_mut()
        .choose(&mut rng) {
            node_context.add_command(cmd, index, None);
            Ok(())
    } else {
        // e.g. return a client error instead of panicking
        Err((ErrorKind::ClientError, "No connection found for pipeline command").into())
    }

Comment on lines 2207 to 2208
// This function handles commands with routing info of MultiSlot (like MSET or MGET), creates sub-commands for the matching slots and adds them to the correct pipeline
async fn handle_multi_slot_routing(
Collaborator

The name of the function should declare that it's only relevant for pipelines; the current name is misleading

};
if let Some((address, conn)) = conn {
let new_cmd =
crate::cluster_routing::command_for_multi_slot_indices(cmd, indices.iter());
Collaborator

add use crate::cluster_routing::command_for_multi_slot_indices and remove the prefix

}
}

fn determine_internal_routing(
Collaborator

Missing description; it isn't clear when and why it would be used

Comment on lines 2256 to 2257
// This function handles commands with routing info of SingleNode
async fn handle_single_node_route(
Collaborator

same issue - all of these functions are placed under ClusterConnInner even though they are only relevant for pipelines. I'm not sure there is a good reason for placing them here. Why not move all of the pipeline logic into a separate file (e.g. pipeline, pipeline_routing) under the async_cluster folder?

@BoazBD
Collaborator

BoazBD commented Jan 21, 2025

Ignore the review request ^, added by accident. 🙂

Collaborator

@barshaul barshaul left a comment

This PR is long lol. Most comments refer to code readability, overly complex types, and redundant 'pub' declarations; please fix in all required places, not only where I commented.
Will continue tomorrow.

@@ -2125,7 +2092,7 @@ where
.map_err(|err| (address.into(), err))
}

async fn try_pipeline_request(
pub async fn try_pipeline_request(
Collaborator

Why pub? pub refers to user-facing APIs

Collaborator

see if it can be removed or if pub(crate) is enough

@@ -2180,7 +2233,7 @@ where
}
}

async fn get_connection(
pub async fn get_connection(
Collaborator

same - function shouldn't be pub

@@ -2139,6 +2106,38 @@ where
.map_err(|err| (OperationTarget::Node { address }, err))
}

/// Aggregates responses for multi-node commands and updates the `values_and_addresses` vector.
Collaborator

Suggested change
/// Aggregates responses for multi-node commands and updates the `values_and_addresses` vector.
/// Aggregates pipeline responses for multi-node commands and updates the `values_and_addresses` vector.

/// - It collects responses and their source node addresses from the corresponding entry in `values_and_addresses`.
/// - Uses the routing information and optional response policy to aggregate the responses into a single result.
///
/// The aggregated result replaces the existing entries in `values_and_addresses` for the given command index.
Collaborator

I read the description 3 times and it's still unclear to me what this function does :(
maybe a simple example would help.

/// - It collects responses and their source node addresses from the corresponding entry in `values_and_addresses`.
/// - Uses the routing information and optional response policy to aggregate the responses into a single result.
///
/// The aggregated result replaces the existing entries in `values_and_addresses` for the given command index.
Collaborator

What do you mean by "replaces the existing entries in values_and_addresses for the given command index"? Do you mean that it sorts the entries in values_and_addresses by the command indices calculated in this function? Try to make it clearer
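
As a concrete, simplified illustration of what that doc comment is describing (strings stand in for redis::Value, and simple concatenation stands in for the real response-policy aggregation):

    fn main() {
        type NodeResponse = (String, String); // (response, source node address)

        // values_and_addresses[i] holds every (response, address) pair collected for
        // pipeline command i: one pair for a single-node command, several for a
        // multi-node command such as an MGET that was split across nodes.
        let mut values_and_addresses: Vec<Vec<NodeResponse>> = vec![
            vec![("OK".into(), "node-a:6379".into())], // command 0: single node
            vec![
                ("v1,v2".into(), "node-a:6379".into()), // command 1: MGET part from node a
                ("v3".into(), "node-b:6379".into()),    // command 1: MGET part from node b
            ],
        ];

        // Aggregate command 1: combine the partial replies into one value
        // (the real code uses the routing info and response policy instead).
        let index = 1;
        let aggregated = values_and_addresses[index]
            .iter()
            .map(|(value, _address)| value.as_str())
            .collect::<Vec<_>>()
            .join(",");

        // The aggregated result replaces the per-node entries at that command index.
        values_and_addresses[index] = vec![(aggregated, String::new())];

        assert_eq!(
            values_and_addresses[1],
            vec![("v1,v2,v3".to_string(), String::new())]
        );
    }

In other words, after aggregation the entry at that index shrinks from several per-node responses to the single combined response, stored with a placeholder empty address as in the PR.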

pub async fn execute_pipeline_on_node<C>(
address: String,
node_context: NodePipelineContext<C>,
) -> Result<(Vec<(usize, Option<usize>)>, Vec<Value>, String), (OperationTarget, RedisError)>
Collaborator

Please create type aliases for the return type (Vec<(usize, Option<usize>)>, Vec<Value>, String); it isn't readable

// might produce multiple responses, each from a different node. By storing the responses with their
// respective node addresses, we ensure that we have all the information needed to aggregate the results later.
// This structure is essential for handling scenarios where responses from multiple nodes must be combined.
let mut values_and_addresses = vec![Vec::new(); pipeline.len()];
Collaborator

we can make this complex type a bit clearer with type aliases:

// define on top
type NodeResponse = (Value, String); // A response with its source node address.
type PipelineResponses = Vec<Vec<NodeResponse>>; // Outer Vec: pipeline commands, Inner Vec: (response, address).
...

let mut values_and_addresses: PipelineResponses = vec![Vec::new(); pipeline.len()];

and it can also be used elsewhere (e.g. aggregate_pipeline_multi_node_commands)

#[allow(clippy::type_complexity)]
pub async fn collect_pipeline_tasks(
join_set: &mut tokio::task::JoinSet<
Result<(Vec<(usize, Option<usize>)>, Vec<Value>, String), (OperationTarget, RedisError)>,
Collaborator

same - use a type alias

Collaborator

then remove #[allow(clippy::type_complexity)]

Comment on lines +339 to +340
/// - `Ok(Some((OperationTarget, RedisError)))`: If one or more tasks encountered an error, returns the first error.
/// - `Ok(None)`: If all tasks completed successfully.
Collaborator

@barshaul barshaul Jan 21, 2025

Since returning Ok with Some(err) is confusing, you can make it more readable with an enum representing the return values, something like:

enum MultiPipelineResult {
    /// All tasks completed successfully.
    AllSuccessful,

    /// Some tasks failed, returning the first encountered error and the associated operation target.
    FirstError {
        target: OperationTarget,
        error: RedisError,
    },
}
