Brief descriptive text on future data pipeline state
metazool committed Jul 18, 2024
1 parent e97cf8d commit 9fa7756
Showing 2 changed files with 5 additions and 1 deletion.
2 changes: 1 addition & 1 deletion docs/diagrams/could_be/instrument_to_store.dot

@@ -11,7 +11,7 @@ digraph G {
  scope [shape=rect label="Microscope \n(FlowCam)"];
  pc [shape=rect label="Local PC"]

- scope2 [shape=rect label="Microscope \n(Flow Cytometer"];
+ scope2 [shape=rect label="Laser imaging \n(Flow Cytometer)"];
  pc2 [shape=rect label="Local PC"]

  san [shape=cylinder label="SAN \nprivate cloud"]
4 changes: 4 additions & 0 deletions docs/diagrams/index.md

@@ -23,6 +23,10 @@ There are file naming conventions including metadata which doesn't follow the sa

### Could be

The PC that drives the instrument is connected to the storage network, but not to the internet (for security standards compliance). What are the current precedents for either saving output directly to shared storage, or for a watcher process that pulls or pushes data from a lab PC to networked storage?

An automated workflow (possibly Apache Airflow or Beam based; the FDRI project is trialling components) watches for new source data, distributes the preprocessing with Dask or Spark if necessary, and continuously publishes analysis-ready data _and metadata_ to cloud storage.

<object data="could_be/instrument_to_store.svg" type="image/svg+xml">
</object>
