Merge pull request #78 from simleo/more_queries
Add SPARQL queries for CQ7, CQ8, CQ9
simleo authored May 31, 2024
2 parents 3285eb1 + 7187c02 commit a6a6350
Showing 5 changed files with 182 additions and 2 deletions.
4 changes: 2 additions & 2 deletions docs/requirements.md
@@ -27,8 +27,8 @@ id | CQ description | Existing/new terms | Rationale | Profile[^1] | Issue # |
CQ5 | How long does this workflow component take to run? | [totalTime](http://schema.org/totalTime)? Allowed on [HowTo](http://schema.org/HowTo) and [HowToDirection](http://schema.org/HowToDirection) but not on [HowToStep](http://schema.org/HowToStep). Can also get actual duration from [endTime](http://schema.org/endTime) - [startTime](http://schema.org/startTime) on the action | If a workflow step is computationally expensive, I may need to get an estimate for impatient users, or show a warning | 1, 3 | [~~13~~](https://github.com/ResearchObject/workflow-run-crate/issues/13) |
CQ6 | How long does this workflow take to run? | [totalTime](http://schema.org/totalTime). Can also get actual duration from [endTime](http://schema.org/endTime) - [startTime](http://schema.org/startTime) on the action | Same as CQ5, but with the full workflow | 2, 3 | [~~14~~](https://github.com/ResearchObject/workflow-run-crate/issues/14) |
CQ7 | Was the execution successful? | [actionStatus](http://schema.org/actionStatus) to [FailedActionStatus](http://schema.org/FailedActionStatus) or [CompletedActionStatus](http://schema.org/CompletedActionStatus) - can also provide [error](http://schema.org/error) | Needed to know whether or not retrieve the results | 1, 2, 3 | [~~15~~](https://github.com/ResearchObject/workflow-run-crate/issues/15) |
-CQ8 | What are the inputs and outputs of the overall workflow (I don't care about the intermediate results) | [object](http://schema.org/object) and [result](http://schema.org/result) on the workflow run action | High level representation of the workflow execution | 2, 3 | [~~16~~](https://github.com/ResearchObject/workflow-run-crate/issues/16) |
-CQ9 | What is the source code version of the component executed in a workflow step? Is it a script? and executable? | [softwareVersion](http://schema.org/softwareVersion), though getting the version of the actual tool (e.g., `grep`) that was called by the wrapper might not be easy | Knowing which release/software version was used (reproducibility) | 1, 3 | [~~17~~](https://github.com/ResearchObject/workflow-run-crate/issues/17) |
+CQ8 | What are the inputs and outputs of the overall workflow? | [object](http://schema.org/object) and [result](http://schema.org/result) on the workflow run action | High level representation of the workflow execution | 2, 3 | [~~16~~](https://github.com/ResearchObject/workflow-run-crate/issues/16) |
+CQ9 | What is the source code version of the component executed in a workflow step? | [softwareVersion](http://schema.org/softwareVersion), though getting the version of the actual tool (e.g., `grep`) that was called by the wrapper might not be easy | Knowing which release/software version was used (reproducibility) | 1, 3 | [~~17~~](https://github.com/ResearchObject/workflow-run-crate/issues/17) |
CQ10 | What is the script used to wrap up a software component? | We're mapping tool wrappers (e.g., `foo.cwl`) to [SoftwareApplication](http://schema.org/SoftwareApplication). Wrappers at lower levels can also be `SoftwareApplication`, but we need to draw the line somewhere | Many executables are complicated, and need an additional script to wrap them up or simplify. For example a "run.sh" script that exposes a simpler set of parameters and fixes another set. | 3 | [~~18~~](https://github.com/ResearchObject/workflow-run-crate/issues/18) |
CQ11 | How were workflow parameters used in tool runs? | We're linking tool params directly (with [connectedTo](http://schema.org/connectedTo)), but that's inaccurate since those links only exist within a workflow. | Knowing how workflow parameters were passed to individual tools to find out how they affected the outputs | 3 | [~~25~~](https://github.com/ResearchObject/workflow-run-crate/issues/25) |
30 changes: 30 additions & 0 deletions docs/sparql/cq7.py
@@ -0,0 +1,30 @@
"""\
This script contains the SPARQL query for Competency Question 7 "Was the
execution successful?". In the discussion on
https://github.com/ResearchObject/workflow-run-crate/issues/15 we decided to
represent this by adding an "actionStatus" property to actions, and consider
an execution successful if its value is "CompletedActionStatus" and not
successful if the value is "FailedActionStatus".
"""

import rdflib
from pathlib import Path

CRATE = Path("crate")

g = rdflib.Graph()
g.parse(CRATE/"ro-crate-metadata.json")

QUERY = """\
PREFIX s: <http://schema.org/>
SELECT ?action ?status
WHERE {
  ?action a s:CreateAction .
  ?action s:actionStatus ?status .
}
"""

qres = g.query(QUERY)
for row in qres:
    print(f"{row.action}, {row.status}")
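
The cq7.py query returns the raw status value for each action and leaves the success/failure decision to the reader. As a rough sketch of that last step, the snippet below applies the rule from the docstring (completed means successful, failed means not successful) directly to the JSON-LD graph using only the standard library. The metadata fragment, the action ids and the helper name `execution_outcomes` are invented for illustration and are not part of this commit; it also assumes the status is written as a full schema.org IRI, either as a plain string or as an `{"@id": ...}` reference.

```python
import json

# Hypothetical metadata fragment written for this example (not from the commit).
METADATA = json.loads("""
{
  "@graph": [
    {"@id": "#run_1", "@type": "CreateAction",
     "actionStatus": {"@id": "http://schema.org/CompletedActionStatus"}},
    {"@id": "#run_2", "@type": "CreateAction",
     "actionStatus": {"@id": "http://schema.org/FailedActionStatus"},
     "error": "Input file not found"},
    {"@id": "#run_3", "@type": "CreateAction"}
  ]
}
""")

SCHEMA = "http://schema.org/"


def as_list(value):
    """Normalize a JSON-LD value that may be missing, a scalar, or a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def execution_outcomes(metadata):
    """Map each CreateAction id to True (completed), False (failed) or None (unknown)."""
    outcomes = {}
    for entity in metadata["@graph"]:
        if "CreateAction" not in as_list(entity.get("@type")):
            continue
        status = entity.get("actionStatus")
        # Assumption: the status is either a string IRI or an {"@id": ...} reference.
        if isinstance(status, dict):
            status = status.get("@id")
        if status == SCHEMA + "CompletedActionStatus":
            outcomes[entity["@id"]] = True
        elif status == SCHEMA + "FailedActionStatus":
            outcomes[entity["@id"]] = False
        else:
            # No actionStatus (or an unrecognized one): success cannot be determined.
            outcomes[entity["@id"]] = None
    return outcomes


outcomes = execution_outcomes(METADATA)
for action_id, ok in outcomes.items():
    print(action_id, ok)
```

Keeping a third "unknown" outcome avoids reporting a run as failed merely because the crate does not record a status.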
53 changes: 53 additions & 0 deletions docs/sparql/cq8.py
@@ -0,0 +1,53 @@
"""\
This script contains the SPARQL query for Competency Question 8 "What are the
inputs and outputs of the overall workflow?". In the discussion on
https://github.com/ResearchObject/workflow-run-crate/issues/16 we identified
them as the "object" and "result" of the action corresponding to the
workflow's execution.
"""

import rdflib
from pathlib import Path

CRATE = Path("crate")

g = rdflib.Graph()
g.parse(CRATE/"ro-crate-metadata.json")

QUERY = """\
PREFIX s: <http://schema.org/>
PREFIX bioschemas: <https://bioschemas.org/>
SELECT ?obj
WHERE {
  ?action a s:CreateAction .
  ?workflow a bioschemas:ComputationalWorkflow .
  ?action s:instrument ?workflow .
  OPTIONAL { ?action s:object ?obj } .
}
"""

qres = g.query(QUERY)
print("INPUTS")
print("======")
for row in qres:
    print(row.obj)

QUERY = """\
PREFIX s: <http://schema.org/>
PREFIX bioschemas: <https://bioschemas.org/>
SELECT ?res
WHERE {
  ?action a s:CreateAction .
  ?workflow a bioschemas:ComputationalWorkflow .
  ?action s:instrument ?workflow .
  OPTIONAL { ?action s:result ?res } .
}
"""

qres = g.query(QUERY)
print("OUTPUTS")
print("=======")
for row in qres:
    print(row.res)
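
Both cq8.py queries select the action whose `instrument` is a `ComputationalWorkflow`, which is what keeps intermediate tool runs out of the result. The sketch below illustrates that filter on a small graph with plain Python and no rdflib. The entity ids, file names and the helper `workflow_io` are made up for this example, and it assumes references are written in the `{"@id": ...}` form.

```python
import json

# Hypothetical metadata fragment; all ids and file names are invented.
METADATA = json.loads("""
{
  "@graph": [
    {"@id": "workflow.cwl",
     "@type": ["File", "SoftwareSourceCode", "ComputationalWorkflow"]},
    {"@id": "#tool", "@type": "SoftwareApplication"},
    {"@id": "#workflow_run", "@type": "CreateAction",
     "instrument": {"@id": "workflow.cwl"},
     "object": [{"@id": "input_1.txt"}, {"@id": "input_2.txt"}],
     "result": {"@id": "output.txt"}},
    {"@id": "#tool_run", "@type": "CreateAction",
     "instrument": {"@id": "#tool"},
     "object": {"@id": "input_1.txt"},
     "result": {"@id": "intermediate.txt"}}
  ]
}
""")


def as_list(value):
    """Normalize a JSON-LD value that may be missing, a scalar, or a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def workflow_io(metadata):
    """Collect object/result ids of actions whose instrument is a ComputationalWorkflow."""
    entities = {e["@id"]: e for e in metadata["@graph"]}
    inputs, outputs = [], []
    for entity in entities.values():
        if "CreateAction" not in as_list(entity.get("@type")):
            continue
        # Resolve each instrument reference to its entity (empty dict if undescribed).
        instruments = [
            entities.get(ref["@id"], {}) for ref in as_list(entity.get("instrument"))
        ]
        if not any(
            "ComputationalWorkflow" in as_list(i.get("@type")) for i in instruments
        ):
            continue  # a tool-level action, i.e. an intermediate step
        inputs += [ref["@id"] for ref in as_list(entity.get("object"))]
        outputs += [ref["@id"] for ref in as_list(entity.get("result"))]
    return inputs, outputs


inputs, outputs = workflow_io(METADATA)
print("INPUTS:", inputs)
print("OUTPUTS:", outputs)
```

The tool-level action `#tool_run` is skipped, so `intermediate.txt` does not appear among the outputs.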
32 changes: 32 additions & 0 deletions docs/sparql/cq9.py
@@ -0,0 +1,32 @@
"""\
This script contains the SPARQL query for Competency Question 9 "What is the
source code version of the component executed in a workflow step?". In
https://github.com/ResearchObject/workflow-run-crate/pull/42 we ended up using
"softwareVersion" with a fallback on "version" on the "SoftwareApplication"
entity, which is used both in Process Run Crates and Provenance Run Crates for
individual tools.
"""

import rdflib
from pathlib import Path

CRATE = Path("process_run_crate")

g = rdflib.Graph()
g.parse(CRATE/"ro-crate-metadata.json")

QUERY = """\
PREFIX s: <http://schema.org/>
SELECT ?name ?version
WHERE {
  ?app a s:SoftwareApplication .
  ?app s:name ?name .
  OPTIONAL { ?app s:softwareVersion ?version } .
  OPTIONAL { ?app s:version ?version } .
}
"""

qres = g.query(QUERY)
for row in qres:
    print(row.name, row.version)
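
The two `OPTIONAL` clauses in cq9.py express the "softwareVersion, else version" fallback described in the docstring. A minimal plain-Python sketch of the same fallback follows. The two entities are copied from the Process Run Crate metadata added in this commit (the rest of the graph is omitted), while the helper name `software_versions` is hypothetical.

```python
import json

# Two entities copied from process_run_crate/ro-crate-metadata.json in this commit.
METADATA = json.loads("""
{
  "@graph": [
    {"@id": "https://www.imagemagick.org/",
     "@type": "SoftwareApplication",
     "url": "https://www.imagemagick.org/",
     "name": "ImageMagick",
     "softwareVersion": "6.9.7-4"},
    {"@id": "https://w3id.org/ro/wfrun/process/0.1",
     "@type": "CreativeWork",
     "name": "Process Run Crate",
     "version": "0.1"}
  ]
}
""")


def software_versions(metadata):
    """Return (name, version) pairs for named SoftwareApplication entities."""
    rows = []
    for entity in metadata["@graph"]:
        types = entity.get("@type", [])
        if isinstance(types, str):
            types = [types]
        if "SoftwareApplication" not in types:
            continue
        if "name" not in entity:
            continue  # the query requires a name, so unnamed entities are skipped
        # Prefer softwareVersion; fall back on version; None if neither is present.
        version = entity.get("softwareVersion", entity.get("version"))
        rows.append((entity["name"], version))
    return rows


rows = software_versions(METADATA)
for name, version in rows:
    print(name, version)
```

The `CreativeWork` entity also carries a `version`, but it is filtered out by type, which mirrors the `?app a s:SoftwareApplication` pattern in the query.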
65 changes: 65 additions & 0 deletions docs/sparql/process_run_crate/ro-crate-metadata.json
@@ -0,0 +1,65 @@
{
    "@context": "https://w3id.org/ro/crate/1.1/context",
    "@graph": [
        {
            "@id": "ro-crate-metadata.json",
            "@type": "CreativeWork",
            "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
            "about": {"@id": "./"}
        },
        {
            "@id": "./",
            "@type": "Dataset",
            "conformsTo": {"@id": "https://w3id.org/ro/wfrun/process/0.1"},
            "hasPart": [
                {"@id": "pics/2017-06-11%2012.56.14.jpg"},
                {"@id": "pics/sepia_fence.jpg"}
            ],
            "mentions": {"@id": "#SepiaConversion_1"},
            "name": "My Pictures"
        },
        {
            "@id": "https://w3id.org/ro/wfrun/process/0.1",
            "@type": "CreativeWork",
            "name": "Process Run Crate",
            "version": "0.1"
        },
        {
            "@id": "https://www.imagemagick.org/",
            "@type": "SoftwareApplication",
            "url": "https://www.imagemagick.org/",
            "name": "ImageMagick",
            "softwareVersion": "6.9.7-4"
        },
        {
            "@id": "#SepiaConversion_1",
            "@type": "CreateAction",
            "name": "Convert dog image to sepia",
            "description": "convert -sepia-tone 80% test_data/sample/pics/2017-06-11\\ 12.56.14.jpg test_data/sample/pics/sepia_fence.jpg",
            "endTime": "2018-09-19T17:01:07+10:00",
            "instrument": {"@id": "https://www.imagemagick.org/"},
            "object": {"@id": "pics/2017-06-11%2012.56.14.jpg"},
            "result": {"@id": "pics/sepia_fence.jpg"},
            "agent": {"@id": "https://orcid.org/0000-0001-9842-9718"}
        },
        {
            "@id": "pics/2017-06-11%2012.56.14.jpg",
            "@type": "File",
            "description": "Original image",
            "encodingFormat": "image/jpeg",
            "name": "2017-06-11 12.56.14.jpg (input)"
        },
        {
            "@id": "pics/sepia_fence.jpg",
            "@type": "File",
            "description": "The converted picture, now sepia-colored",
            "encodingFormat": "image/jpeg",
            "name": "sepia_fence (output)"
        },
        {
            "@id": "https://orcid.org/0000-0001-9842-9718",
            "@type": "Person",
            "name": "Stian Soiland-Reyes"
        }
    ]
}
