Add FileSystem (FS) cloudProvider option to importer #1
Story
As a Hedera developer, I would like the hedera-mirror-node to be able to ingest live data generated locally by hedera-services, without the need to push/pull it through an S3-compatible file service.
Architecture
To make local file-system loading of rcd files possible, a new `hedera.mirror.importer.downloader.cloudProvider` value called `FS` (short for `FileSystem`) has been defined. Using it will most likely also require setting `hedera.mirror.importer.network` to `LOCAL` so that the mirror-node runtime can pick up local file-system changes. The implicit root file path is `/opt/hedera/services/data`, which is the default `hedera-services` Docker path when spinning up a local Hedera network. This root file path can be customized via the `hedera.mirror.importer.downloader.bucketName` property.
The following diagram illustrates the proposed architecture for this feature:
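
As a rough companion to the diagram, the local-file branch of the client hierarchy described in this section could be sketched in Java as follows. Only the type names and `download`, `list`, `downloadSignatureFiles`, `rootPath` and `pathPrefixFor` come from this description; every signature, the hypothetical `downloadMatching` helper, the `_sig` suffix filter and the `record<node>` directory prefix are assumptions made for illustration, and plain `String` names stand in for `StreamFilename`. `S3FileClient` and `PendingDownload` are left out to keep the sketch free of SDK dependencies.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Assumed shape: the description only says FileClient offers download and list.
interface FileClient {
    byte[] download(String path) throws IOException;

    List<String> list(String prefix) throws IOException;
}

// Adds bulk downloads of the files that match a filter predicate.
abstract class MultiFileClient implements FileClient {
    // Hypothetical helper: download every listed file accepted by the filter.
    Map<String, byte[]> downloadMatching(String prefix, Predicate<String> filter) throws IOException {
        Map<String, byte[]> result = new LinkedHashMap<>();
        for (String name : list(prefix)) {
            if (filter.test(name)) {
                result.put(name, download(name));
            }
        }
        return result;
    }

    // Assumption: signature files are recognised by a "_sig" suffix.
    Map<String, byte[]> downloadSignatureFiles(String prefix) throws IOException {
        return downloadMatching(prefix, name -> name.endsWith("_sig"));
    }
}

// Carries configuration; in the PR this is Spring-injected, here it is a constructor argument.
abstract class ParameterizedFileClient extends MultiFileClient {
    protected final Path rootPath;

    ParameterizedFileClient(Path rootPath) {
        this.rootPath = rootPath;
    }

    // Assumption: node-dependent prefix is "record" followed by the node account id.
    String pathPrefixFor(String nodeAccountId) {
        return "record" + nodeAccountId;
    }
}

// Reads files straight from the local file system under rootPath.
class LocalFileClient extends ParameterizedFileClient {
    LocalFileClient(Path rootPath) {
        super(rootPath);
    }

    @Override
    public byte[] download(String path) throws IOException {
        return Files.readAllBytes(rootPath.resolve(path));
    }

    @Override
    public List<String> list(String prefix) throws IOException {
        try (Stream<Path> files = Files.list(rootPath.resolve(prefix))) {
            return files.filter(Files::isRegularFile)
                    .map(p -> rootPath.relativize(p).toString())
                    .sorted()
                    .collect(Collectors.toList());
        }
    }
}
```

With this shape, a caller could write `new LocalFileClient(root).downloadSignatureFiles(client.pathPrefixFor("0.0.3"))` and receive only the signature files of that node directory, keyed by their path relative to the root.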
The core contribution is refactoring the Importer > Downloader > `S3AsyncClient` dependency into a hierarchy based on common functionality. It starts from a `FileClient` interface that provides simple `download` and `list` capabilities. On top of it, an abstract `MultiFileClient` class allows bulk downloads of `StreamFilename`s that match a provided filter predicate. Currently, `MultiFileClient` exposes `downloadSignatureFiles`.
On top of `MultiFileClient` we add a `ParameterizedFileClient`, which is essentially a `MultiFileClient` aware of Spring-injected configuration, with common property exports such as `rootPath` or `pathPrefixFor` (to get the node-dependent location prefix). `ParameterizedFileClient` branches into an `S3FileClient` and a `LocalFileClient`, which do the actual file retrieval work.
On the other end, downloading through a `FileClient` now returns a generic `PendingDownload` instance which, through a new `DownloadResult` interface, abstracts away S3 SDK retrieved-object responses (`S3PendingDownload`) and local-file downloads (see `PendingDownload.SimpleResultForwarder`).
Running it
Configure your project-root `application.yml` to use the `FS` (`FileSystem`) cloud provider targeting a `LOCAL` network.
You can provide a different path for the importer to operate on via `hedera.mirror.importer.downloader.bucketName`, which defaults to `/opt/hedera/services/data`.
Also, you might want to decrease the record check-up frequency (`hedera.mirror.downloader.record.frequency`) so that the downloader won't put too much pressure on the host operating system.
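
Putting the settings named in this section together, an `application.yml` might look roughly like the fragment below. It is a sketch that assumes the dotted property names map onto nested YAML keys in the usual Spring way; the record frequency setting is left out because no concrete value is given here.

```yaml
hedera:
  mirror:
    importer:
      # Target a locally running network
      network: LOCAL
      downloader:
        # Read stream files from the local file system instead of S3
        cloudProvider: FS
        # Optional root path override; this value is the stated default
        bucketName: /opt/hedera/services/data
```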