Valkey Proxy Proposal #11

hwware · 2024-10-18T07:56:14Z

No description provided.

vongosling · 2024-10-29T11:35:53Z

ValkeyProxy.md

+## Abstract  (Required)
+
+The proposed ValkeyProxy is to provide the features similar to redis-cluster-proxy. 
+


The proposed ValkeyProxy is to accept connections from clients and multiplexes them onto connections to the backend server processes. It is horizontally scalable and multi-tenant.

It serves several main purposes beyond simply routing requests to the right processes.

It provides scalability by offloading connection overhead from the backend server processes. It also could hold client connections open even as the server processes are being migrated or restarted. this is very helpful in cloud-native situations, our control over various open source sdks behavior can be extremely limited. For scenarios with lager data volumes, we could provide read-write separation mode through the proxy.

what's more, we could add stateless, secure HTTP connections directly, enhanced connection observability and so on.

ValkeyProxy.md

vongosling · 2024-10-29T12:29:56Z

ValkeyProxy.md

+        For some time-consuming or high CPU cost commands, we can make statistics in Proxy side, and for some results, such as min/max/avg/p95/p99 from memtier_benchmark or valkey_benchmark, clients can get timely statistical information.
+
+
+6.	Support audit log 


Cool, this is a hook in my opinion

Why wouldn't we implement audit logs like this into the server though?

On benefit implemented in proxy side is to retrieve this information from one point, user do not need to fetch the log in different nodes in the cluster mode.

madolson

I'm fairly dubious of proxies in Valkey as a general solution. They do solve problems at scale related to number of connections, but I think many of the problems can also be solved by improving clients.

There are some cool things that could be solved though, like implementing meta-commands (like distributed locks, sorted sets, etc) which is also very interesting. This isn't exactly the same as your proposal, since you seem to be proposing almost API compatibility with the core.

madolson · 2024-11-13T01:53:43Z

ValkeyProxy.md

+
+## Design Considerations  (Required)
+
+1.	Support all existing Valkey commands. 


I would like to bring up that some commands are intrinsically hard to support like lua scripts and pub/sub. Do we need to necessarily implement all commands?

Even not all, but most is necessary, IMO

ValkeyProxy.md

madolson · 2024-11-13T01:56:57Z

ValkeyProxy.md

+
+# ValkeyProxy RFC
+
+## Abstract  (Required)


One of the things I'm generally unsure of is whether or not we really a proxy compared to just making clients better. It's sort of been the project of glide to abstract away as much of the differences you have outlined here.

As an aside, we might be able to use the core of glide to implement some of this logic.

madolson · 2024-11-13T01:57:40Z

ValkeyProxy.md

+        For some time-consuming or high CPU cost commands, we can make statistics in Proxy side, and for some results, such as min/max/avg/p95/p99 from memtier_benchmark or valkey_benchmark, clients can get timely statistical information.
+
+
+6.	Support audit log 


Why wouldn't we implement audit logs like this into the server though?

madolson · 2024-11-13T01:58:45Z

ValkeyProxy.md

+8.	Provide a connection pool to facilitate client connection.
+
+        Connection pooling can reduce the overhead associated with opening and closing connections and with keeping many 
+connections open simultaneously. This overhead not only includes memory needed to handle each new connection, also involves 
+CPU overhead to close each connection and open a new one. 


I think connection pooling is done fairly effectively on the clients today. The only real drawback I've seen is when we have large clusters, so each client has to maintain a very large number of outbound connections

madolson · 2024-11-13T01:59:43Z

ValkeyProxy.md

+
+        For the client's persistent connection, the connection will not actively disconnect, but only returns the error message; If the client is disconnected actively, the connection will be disconnected immediately.
+
+9.	Support CLIENT LIST etc commands aggregate return: return the client of all proxy nodes.


Aggregate commands are cool, we've also heard folks interested in running flushall across the whole cluster. Configuration like config set might also be possible.

madolson · 2024-11-13T02:01:56Z

ValkeyProxy.md

+7.      Provide reading/writing separation features
+
+        In production environments, most scenarios involve more reading requests and less writing requests. By enabling
+read-write separation feature in the Proxy, we can let the primary node focus on processing write operations and let the 
+replica handle read operations. It is very suitable for scenarios with relatively large read operations and can reduce the 
+pressure on the primary.


Ooh, this can go horribly wrong. We talked with a customer that was doing caching and did read-write splits, they were having cache misses on replicas but the writes were going to the primary, but the replica was slow at consuming so it would miss again and trigger another write. I think generally read/write splitting is a big anti-pattern for online transactional workloads like caching or message queues. it makes more sense for analytic workloads, but I don't Valkey really serves much analytics today. Maybe that will change

The reason we want to implement this feature in ValkeyProxy is:

This feature (read-write separation) could be enabled or disabled through config parameter. And by integrating with ACL, this feature only applies to some specific users

For those data with a relatively low update frequency, read misses due to differences in primary and replica data, which in turn lead to multiple writes, are acceptable in some scenarios.

If data consistency guarantees are a must following the reason 2 case, synchronous between the master and replica can be considered.

madolson

From weekly meeting @PingXie, should we support cross-slot operations (including cross shard operations) . Will we support that functionality and what type of guarantee will provide with that.

Signed-off-by: hwware <[email protected]>

PingXie · 2024-12-19T02:45:57Z

Thank @hwware for putting together this proposal.

From weekly meeting @PingXie, should we support cross-slot operations (including cross shard operations) . Will we support that functionality and what type of guarantee will provide with that.

My primary concern here is that simulating standalone behavior on a cluster creates a third mode that is neither standalone nor cluster. Losing atomicity is one of the differences, and then there are pub/sub, streams, scan, eval, functions, etc., to consider. I feel we might end up creating a project that is either 10x more complex or has a much narrower scope (no atomicity, no pub/sub, etc.).

Networking overhead also plays a huge role in in-memory caching performance. For KV caching workloads, where compatibility is less of a concern, adding a network hop would have a very significant impact.

Moreover, the motivations outlined in the RFC don’t necessarily require a proxy to deliver. For example, as @madolson pointed out, Glide already addresses some of these needs with the added potential for end-to-end observability and audit logging.

It seems like the real need here might be a better cluster client rather than a proxy layer. A robust client could solve many of these challenges without introducing unnecessary complexity or performance trade-offs.

hwware force-pushed the ValkeyProxy branch from 6a66bbf to 3175aa5 Compare October 18, 2024 07:57

hwware changed the title ~~Valkey Proxy Proposal~~ WIP--Valkey Proxy Proposal Oct 18, 2024

hwware force-pushed the ValkeyProxy branch 2 times, most recently from 67d1a2a to 7b148e5 Compare October 23, 2024 14:21

vongosling approved these changes Oct 29, 2024

View reviewed changes

vongosling reviewed Oct 29, 2024

View reviewed changes

hwware force-pushed the ValkeyProxy branch from 913a310 to c4cfc5f Compare November 8, 2024 08:28

hwware changed the title ~~WIP--Valkey Proxy Proposal~~ Valkey Proxy Proposal Nov 8, 2024

hwware marked this pull request as ready for review November 8, 2024 08:29

hwware added the ready-for-discussion label Nov 8, 2024

madolson reviewed Nov 13, 2024

View reviewed changes

madolson reviewed Nov 19, 2024

View reviewed changes

hwware added 5 commits December 18, 2024 17:34

Initial Draft

4dabd47

Signed-off-by: hwware <[email protected]>

Update Design

69eb9b0

Signed-off-by: hwware <[email protected]>

Second commit

4a5ebde

Signed-off-by: hwware <[email protected]>

Add 2 more items for ValkeyPorxy features

8700c1f

Signed-off-by: hwware <[email protected]>

update doc and remove multiply db section

1183c8e

Signed-off-by: hwware <[email protected]>

hwware force-pushed the ValkeyProxy branch from fd3c085 to 1183c8e Compare December 18, 2024 17:36

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Valkey Proxy Proposal #11

Valkey Proxy Proposal #11

hwware commented Oct 18, 2024

vongosling Oct 29, 2024 •

edited

Loading

vongosling Oct 29, 2024

madolson Nov 13, 2024

hwware Nov 14, 2024

madolson left a comment

madolson Nov 13, 2024

hwware Nov 15, 2024

madolson Nov 13, 2024

madolson Nov 13, 2024

madolson Nov 13, 2024

madolson Nov 13, 2024

madolson Nov 13, 2024

hwware Nov 15, 2024

madolson left a comment •

edited

Loading

PingXie commented Dec 19, 2024

		## Abstract (Required)

		The proposed ValkeyProxy is to provide the features similar to redis-cluster-proxy.

		For some time-consuming or high CPU cost commands, we can make statistics in Proxy side, and for some results, such as min/max/avg/p95/p99 from memtier_benchmark or valkey_benchmark, clients can get timely statistical information.


		6. Support audit log


		## Design Considerations (Required)

		1. Support all existing Valkey commands.


		For the client's persistent connection, the connection will not actively disconnect, but only returns the error message; If the client is disconnected actively, the connection will be disconnected immediately.

		9. Support CLIENT LIST etc commands aggregate return: return the client of all proxy nodes.

Valkey Proxy Proposal #11

Are you sure you want to change the base?

Valkey Proxy Proposal #11

Conversation

hwware commented Oct 18, 2024

vongosling Oct 29, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

madolson left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

madolson left a comment • edited Loading

Choose a reason for hiding this comment

PingXie commented Dec 19, 2024

vongosling Oct 29, 2024 •

edited

Loading

madolson left a comment •

edited

Loading