Brief descriptive text on future data pipeline state
metazool committed Jul 18, 2024
1 parent e97cf8d commit 9fa7756
Showing 2 changed files with 5 additions and 1 deletion.
2 changes: 1 addition & 1 deletion docs/diagrams/could_be/instrument_to_store.dot

@@ -11,7 +11,7 @@ digraph G {
  scope [shape=rect label="Microscope \n(FlowCam)"];
  pc [shape=rect label="Local PC"]

- scope2 [shape=rect label="Microscope \n(Flow Cytometer"];
+ scope2 [shape=rect label="Laser imaging \n(Flow Cytometer)"];
  pc2 [shape=rect label="Local PC"]

  san [shape=cylinder label="SAN \nprivate cloud"]
4 changes: 4 additions & 0 deletions docs/diagrams/index.md

@@ -23,6 +23,10 @@ There are file naming conventions including metadata which doesn't follow the sa

### Could be

The PC that drives the instrument is connected to the storage network, but not to the internet (for security standards compliance). What are the current precedents for either saving output directly to shared storage, or for a watcher process that pulls or pushes data from a lab PC to networked storage?

An automated workflow (possibly Apache Airflow or Beam based; the FDRI project is trialling components) watches for new source data, distributes the preprocessing with Dask or Spark if necessary, and continuously publishes analysis-ready data _and metadata_ to cloud storage.

<object data="could_be/instrument_to_store.svg" type="image/svg+xml">
</object>
