storagenode,client: introduce LogStreamAppender #433

Closed
3 tasks done
ijsong opened this issue May 4, 2023 · 2 comments · Fixed by #449, #457, #459 or #464

ijsong commented May 4, 2023

Motivation

Varlog is a storage system for logs that is both ordered and distributed. To ensure that the storage system is scalable and tolerant to faults, the storage layer in Varlog is partitioned. These partitions are called LogStreams, and each LogStream orders log entries sequentially in what we call "Local Order". The log entries are also arranged sequentially across LogStreams in what we call "Global Order".

Certain applications, such as Kov (Kafka on Varlog), only require "Local Order" and can use the Varlog client. However, this client has some limitations, such as blocking operations until prior results are received and only providing synchronous patterns. Supporting asynchronous Append operations is not trivial. See #441.

To address this, we propose LogStreamAppender, a new asynchronous client that supports only the Append operation on a single log stream.

Design

The LogStreamAppender operates on a single LogStream: it can append new log entries only to that LogStream. Although it is limited to one LogStream, it can take advantage of asynchronous patterns and therefore does not block user code.

// LogStreamAppenderOption configures a LogStreamAppender.
type LogStreamAppenderOption interface {
}

// WithPipelineSize sets request pipeline size. The default pipeline size is
// two. Any value below one will be set to one, and any above eight will be
// limited to eight.
func WithPipelineSize(pipelineSize int) LogStreamAppenderOption {
}

// WithDefaultBatchCallback sets the default callback function. The default callback
// function can be overridden by the argument callback of the AppendBatch
// method.
func WithDefaultBatchCallback(callback BatchCallback) LogStreamAppenderOption {
}

type Varlog interface {
	// NewLogStreamAppender returns a new LogStreamAppender. The argument ctx
	// is used for the entire lifetime of the returned LogStreamAppender.
	NewLogStreamAppender(ctx context.Context, tpid types.TopicID, lsid types.LogStreamID, opts ...LogStreamAppenderOption) (LogStreamAppender, error)
}

// BatchCallback is a callback function to notify the result of
// AppendBatch.
type BatchCallback func([]varlogpb.LogEntryMeta, error)

// LogStreamAppender is a client that can only append to a particular log
// stream.
type LogStreamAppender interface {
	// AppendBatch appends dataBatch to the given log stream asynchronously.
	// Users can call this method without being blocked until the pipeline of
	// the LogStreamAppender is full. If the pipeline of the LogStreamAppender
	// is already full, it may become blocked. However, the process will
	// continue once a response is received from the storage node.
	// On completion of AppendBatch, the argument callback provided by users
	// will be invoked. All callback functions registered to the same
	// LogStreamAppender will be called by the same goroutine sequentially.
	// Therefore, the callback should be lightweight. If heavy work is
	// necessary for the callback, it would be better to use separate worker
	// goroutines.
	// The only error from the AppendBatch is ErrClosed, which is returned when
	// the LogStreamAppender is already closed. It returns nil even if the
	// underlying stream is disconnected and notifies errors via callback.
	AppendBatch(dataBatch [][]byte, callback BatchCallback) error

	// Close closes the LogStreamAppender client. Once the client is closed,
	// calling AppendBatch will fail immediately. If an AppendBatch call is
	// still waiting for room in the pipeline, Close will block. It also waits
	// for all pending callbacks to be called.
	Close()
}
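
The clamping rule described for WithPipelineSize (default two, minimum one, maximum eight) could be realized with functional options. The following is only a sketch under assumptions: `appenderConfig`, `optionFunc`, `newAppenderConfig`, and the unexported `apply` method are hypothetical names that do not appear in this issue.

```go
package main

import "fmt"

const (
	minPipelineSize     = 1
	maxPipelineSize     = 8
	defaultPipelineSize = 2
)

// appenderConfig is a hypothetical holder for the settings; the real
// client's internal layout is not shown in this issue.
type appenderConfig struct {
	pipelineSize int
}

// LogStreamAppenderOption mirrors the interface in the design, with one
// unexported method added here (an assumption) so options can be applied.
type LogStreamAppenderOption interface {
	apply(*appenderConfig)
}

// optionFunc adapts a plain function to LogStreamAppenderOption.
type optionFunc func(*appenderConfig)

func (f optionFunc) apply(c *appenderConfig) { f(c) }

// WithPipelineSize clamps the requested size into [1, 8].
func WithPipelineSize(pipelineSize int) LogStreamAppenderOption {
	return optionFunc(func(c *appenderConfig) {
		if pipelineSize < minPipelineSize {
			pipelineSize = minPipelineSize
		}
		if pipelineSize > maxPipelineSize {
			pipelineSize = maxPipelineSize
		}
		c.pipelineSize = pipelineSize
	})
}

// newAppenderConfig starts from the default and applies options in order.
func newAppenderConfig(opts ...LogStreamAppenderOption) appenderConfig {
	cfg := appenderConfig{pipelineSize: defaultPipelineSize}
	for _, opt := range opts {
		opt.apply(&cfg)
	}
	return cfg
}

func main() {
	fmt.Println(newAppenderConfig().pipelineSize)                      // 2 (default)
	fmt.Println(newAppenderConfig(WithPipelineSize(0)).pipelineSize)   // 1 (raised to the minimum)
	fmt.Println(newAppenderConfig(WithPipelineSize(100)).pipelineSize) // 8 (capped at the maximum)
}
```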

NewLogStreamAppender returns a new LogStreamAppender client, which provides an asynchronous Append API for the particular log stream specified by the arguments tpid and lsid.
The returned client can only append to that log stream. When a client issues multiple operations from a single goroutine, the log entries are appended in the order the operations were issued.
Ordering is not guaranteed across different clients connected to the same log stream. Even if two clients, c1 and c2, are connected to the same log stream and a single goroutine interleaves Append calls on them, "Local Order" is not guaranteed between the two clients' operations.
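
To make the intended semantics concrete, here is a toy in-memory model of the behavior described above. It is not the Varlog client: `toyAppender`, `entryMeta` (standing in for varlogpb.LogEntryMeta), and the locally assigned sequence numbers are all assumptions for illustration, and no storage node is involved. It shows AppendBatch blocking only when the pipeline is full, callbacks being invoked sequentially by one goroutine, ErrClosed after Close, and Close waiting for pending callbacks.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrClosed is returned by AppendBatch after Close.
var ErrClosed = errors.New("appender: closed")

// entryMeta stands in for varlogpb.LogEntryMeta; only a local sequence
// number is modeled here.
type entryMeta struct{ LocalSeq uint64 }

type batchCallback func([]entryMeta, error)

type request struct {
	dataBatch [][]byte
	callback  batchCallback
}

type toyAppender struct {
	mu      sync.RWMutex
	closed  bool
	queue   chan request  // capacity models the pipeline size
	done    chan struct{} // closed when the worker has drained the queue
	nextSeq uint64        // touched only by the worker goroutine
}

func newToyAppender(pipelineSize int) *toyAppender {
	a := &toyAppender{
		queue:   make(chan request, pipelineSize),
		done:    make(chan struct{}),
		nextSeq: 1,
	}
	go a.run()
	return a
}

// run handles requests in submission order and invokes every callback
// from this single goroutine, one after another.
func (a *toyAppender) run() {
	defer close(a.done)
	for req := range a.queue {
		metas := make([]entryMeta, len(req.dataBatch))
		for i := range metas {
			metas[i] = entryMeta{LocalSeq: a.nextSeq}
			a.nextSeq++
		}
		if req.callback != nil {
			req.callback(metas, nil)
		}
	}
}

// AppendBatch enqueues a batch; it blocks only while the pipeline is full.
func (a *toyAppender) AppendBatch(dataBatch [][]byte, cb batchCallback) error {
	a.mu.RLock()
	defer a.mu.RUnlock()
	if a.closed {
		return ErrClosed
	}
	a.queue <- request{dataBatch: dataBatch, callback: cb}
	return nil
}

// Close waits for in-progress AppendBatch calls, then for all callbacks.
func (a *toyAppender) Close() {
	a.mu.Lock()
	if !a.closed {
		a.closed = true
		close(a.queue)
	}
	a.mu.Unlock()
	<-a.done
}

func main() {
	a := newToyAppender(2)
	var firsts []uint64 // written by the worker, read only after Close returns
	for i := 0; i < 3; i++ {
		batch := [][]byte{[]byte("a"), []byte("b")}
		_ = a.AppendBatch(batch, func(metas []entryMeta, err error) {
			firsts = append(firsts, metas[0].LocalSeq)
		})
	}
	a.Close()
	fmt.Println(firsts)                               // [1 3 5]
	fmt.Println(a.AppendBatch(nil, nil) == ErrClosed) // true
}
```

Because callbacks run on the worker goroutine, a callback in this sketch must not call AppendBatch on the same appender while the pipeline is full, or it would block itself. That is one reason to keep callbacks lightweight.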

Challenges

These are the challenges we need to address:

  • What is the potential performance improvement we can achieve with this enhancement?
  • How do we accurately measure the performance gain?
  • When utilizing an internal buffer, how can we determine the optimal parameters?
  • What are some ideal uses for our clients?

Tasks

@ijsong ijsong self-assigned this May 4, 2023

ijsong commented May 4, 2023

@hungryjang, let's discuss an asynchronous client for "Local Order".

ijsong changed the title from "client: introduce LocalOrderClient" to "storagenode,client: introduce LocalOrderClient" on May 10, 2023
ijsong added a commit that referenced this issue May 16, 2023
This PR changes the Append RPC from a unary call to a bidirectional stream. However, it neither
adds nor changes a stream-style Append API, so the change is transparent to users.

This is a starting point for supporting a streaming Append API and the LocalOrderClient mentioned in #433.

ijsong commented May 17, 2023

There are two ways to deliver the result of an asynchronous call: a callback or a channel. We can refer to some examples:

Providing only blocking calls and leaving the asynchronous implementation up to the user is also a good approach, but we will not consider it.
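
As a rough illustration of the two notification styles, the sketch below delivers the same result once through a callback and once through a channel. The names `result`, `appendWithCallback`, and `appendWithChannel` are hypothetical, and the "offset" is just the input length standing in for a real append result.

```go
package main

import (
	"fmt"
	"sync"
)

// result is a hypothetical outcome of an asynchronous append.
type result struct {
	offset int
	err    error
}

// appendWithCallback does the work in a new goroutine and reports the
// outcome by invoking cb from that goroutine.
func appendWithCallback(data string, cb func(result)) {
	go func() { cb(result{offset: len(data)}) }()
}

// appendWithChannel returns a buffered channel that receives exactly one
// result, so the worker never blocks even if the caller stops listening.
func appendWithChannel(data string) <-chan result {
	ch := make(chan result, 1)
	go func() { ch <- result{offset: len(data)} }()
	return ch
}

func main() {
	// Callback style: the caller must synchronize on its own.
	var wg sync.WaitGroup
	wg.Add(1)
	appendWithCallback("hello", func(r result) {
		defer wg.Done()
		fmt.Println("callback:", r.offset, r.err)
	})
	wg.Wait()

	// Channel style: the caller decides when to wait for the result.
	r := <-appendWithChannel("hello")
	fmt.Println("channel:", r.offset, r.err)
}
```

The callback style moves the synchronization burden to the caller but avoids allocating a channel per call; the channel style composes with `select` at the cost of one channel per operation.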

Here are great talks:

ijsong changed the title from "storagenode,client: introduce LocalOrderClient" to "storagenode,client: introduce LogStreamAppender" on May 18, 2023
ijsong added a commit that referenced this issue May 23, 2023
This patch changes the Append RPC handler to support pipelined requests without changing the
client's API, so users can keep using the Append API transparently.

Supporting pipelined requests can add overhead because it requires additional goroutines and
concurrent queues. In experiments, this PR showed little overhead.
This change uses a [reader-biased mutex](https://github.com/puzpuzpuz/xsync#rbmutex) instead of
the built-in RWMutex to avoid shared lock contention.

This PR implements the server-side parts of the LogStreamAppender described in #433. It can also be
used to pipeline the generic Append RPC discussed in #441.
ijsong added a commit that referenced this issue May 24, 2023
This patch changes the Append RPC handler to support pipelined requests without changing the
client's API, so users can keep using the Append API transparently.

Supporting pipelined requests can add overhead because it requires additional goroutines and
concurrent queues. To reduce that overhead, this change uses a [reader-biased
mutex](https://github.com/puzpuzpuz/xsync#rbmutex) instead of the built-in RWMutex to avoid shared
lock contention. In experiments, this PR showed very little overhead. Furthermore, we can make the
existing Append API more efficient by
[using a long-lived stream](https://grpc.io/docs/guides/performance/#general): the current
implementation creates a new stream on every Append call, which incurs unnecessary work
such as RPC initiation. We can reuse long-lived streams by changing the client API. See #458.

This PR implements the server-side parts of the LogStreamAppender described in #433. It can also be
used to pipeline the generic Append RPC discussed in #441.
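
The "additional goroutines and concurrent queues" mentioned above can be pictured with a channel-based sketch. This is not the storage node's handler: plain channels stand in for the two directions of the bidirectional stream, `handlePipelined` is a hypothetical name, and upper-casing a string stands in for the actual append work. A receiver loop accepts requests, a bounded queue keeps their pending results in arrival order, and a sender goroutine writes responses back in that same order.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// handlePipelined reads requests from in, processes them concurrently, and
// writes responses to out in request order. The pending queue bounds how
// many results may be outstanding at once.
func handlePipelined(in <-chan string, out chan<- string, depth int) {
	pending := make(chan chan string, depth) // ordered queue of in-flight results

	var wg sync.WaitGroup
	wg.Add(1)
	go func() { // sender: drains results in the order requests arrived
		defer wg.Done()
		for resCh := range pending {
			out <- <-resCh
		}
		close(out)
	}()

	for req := range in { // receiver: accepts the next request without waiting for earlier ones
		resCh := make(chan string, 1)
		pending <- resCh // blocks once the queue of in-flight results is full
		go func(r string) { resCh <- strings.ToUpper(r) }(req)
	}
	close(pending)
	wg.Wait()
}

func main() {
	in := make(chan string)
	out := make(chan string)
	go handlePipelined(in, out, 2)
	go func() {
		for _, r := range []string{"a", "b", "c", "d"} {
			in <- r
		}
		close(in)
	}()
	for resp := range out {
		fmt.Print(resp, " ")
	}
	fmt.Println()
}
```

The client can therefore send a second request before the first response arrives, while responses still come back in request order.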
ijsong added a commit that referenced this issue May 25, 2023
This PR adds LogStreamAppender to the client: a client that appends asynchronously to a single,
particular log stream.

Resolves #433.