Provenance Capture in vsr2.xsl
Tim L edited this page May 10, 2015
·
102 revisions
- Prizms provides csv2rdf4lod-automation, SPARQL endpoint, and metadata
- The provenance uses the organizing principles from csv2rdf4lod-automation
- vsr2.xsl implements the provenance capture.
- The technique described on this page is also discussed in our COLD 2013 submission.
- Ashley Clark's very nice discussion about provenance in XSL.
- Jun Zhao's and Jeni Tennison's OPMV XSLT Module.
- Install Prizms on a fresh VM, backed by the ieeevis github repo.
prizms/bin/install.sh --me http://tw.rpi.edu/instances/TimLebo --my-email [email protected] --proj-user ieeevis --repos [email protected]:timrdf/ieeevis.git --upstream-ckan --our-base-uri http://aquarius.tw.rpi.edu/projects/ieeevis --our-source-id tw-rpi-edu --our-datahub-id twc-ieeevis
- Retrieve some RDF, situating it within an "SDV organization"
- cr-dcat-retrieval-url.sh (and commit the access metadata)
lebot@ieeevis:~/prizms/ieeevis/data/source$ cr-dcat-retrieval-url.sh tbl-foaf http://dig.csail.mit.edu/2008/webdav/timbl/foaf.rdf
git add -f dig-csail-mit-edu/tbl-foaf/access.ttl
git commit -m 'tbl foaf access.ttl'; git push
- pull repo as ieeevis
- cr-retrieve.sh according to the dcat access metadata.
ieeevis@ieeevis:~/prizms/ieeevis/data/source$ git pull; cr-retrieve.sh -w --skip-if-exists
- Invoke vsr2grf.sh for {rdf, custom} visual strategy x {graffle, svg, graphml} target formats
- If invoked from within a conversion cockpit, feed the dataset URI in to the transform.
- Feed VSR_PROVENANCE in to the transform.
- Logging is set up, populated, then torn down within XSL at:
- log:new($visual-artifact-uri) will [instantiate a Java instance](Invoking Java from XSLT using saxonb9 1 0 8j) of class xmlns:log="java:edu.rpi.tw.visualization.log.VisualizationDecisions"
- Strategy handlers first report their arrival upon a subject or triple:
- Strategy handlers then report their decision about the subject or triple:
- log:export($log) flushes the Java instance's log to a dump file.
- If logging is enabled, disable xsl:messages in the explain.xsl functions.
- add finish() to vsr2.xsl and dump Sesame repository to stderr (instead of to sesame server).
- add log calls for edge and node invocations; review arrive/explain log calls.
- review graphic element annotations
- Rerun with different visual strategies
- Rerun vsr2grf.sh to different concrete target graphical formats:
- Publish files to the web (e.g. foaf.rdf.graffle)
cr-ln-to-www-root.sh source/foaf.rdf* manual/foaf.rdf.graffle*
- aggregate-source-rdf into SPARQL endpoint.
aggregate-source-rdf.sh source/foaf.rdf.pml.ttl manual/foaf.rdf.graffle.prov.ttl
- publish graffle/svg/graphml file to www - make sure its location is referenced in the provenance.
ieeevis@ieeevis:~/prizms/ieeevis/data/source/dig-csail-mit-edu/tbl-foaf/version/2011-Feb-01$ vsr2grf.sh rdf graffle -w -od manual source/foaf.rdf
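The strategy x format combinations listed earlier can be enumerated with a small loop. This is a minimal sketch, not part of Prizms: the function name `vsr2grf_matrix` is hypothetical, `custom` is only a placeholder for whatever custom visual strategy is in use, and the argument order is copied from the invocation shown above. It prints the commands instead of running them, so they can be reviewed first.

```shell
#!/usr/bin/env bash
# Sketch: print one vsr2grf.sh command per (strategy, format) pair.
# Assumption: 'custom' stands in for the name of a real custom strategy.
vsr2grf_matrix() {
  local input="$1"
  local outdir="${2:-manual}"   # output directory, 'manual' as in the example above
  local strategy format
  for strategy in rdf custom; do
    for format in graffle svg graphml; do
      # Same argument order as the invocation shown above.
      echo "vsr2grf.sh $strategy $format -w -od $outdir $input"
    done
  done
}
```

Calling `vsr2grf_matrix source/foaf.rdf` lists six commands; once they look right, the output could be piped to `bash` from within the conversion cockpit.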
The OmniGraffle file knows what contents it is visualizing:
bash-3.2$ vsr2grf.sh rdf graffle -w http://dig.csail.mit.edu/2008/webdav/timbl/foaf.rdf; grddl.sh foaf.rdf.graffle | grep "depicts>"
Transforming foaf.rdf to ./foaf.rdf.graffle
<http://open.vocab.org/terms/depicts> <http://creativecommons.org/licenses/by-nc/3.0/>;
<http://open.vocab.org/terms/depicts> <http://xmlns.com/foaf/0.1/PersonalProfileDocument>;
<http://open.vocab.org/terms/depicts> <http://www.w3.org/People/Berners-Lee/card#i>;
<http://open.vocab.org/terms/depicts> <http://xmlns.com/foaf/0.1/Person>;
<http://open.vocab.org/terms/depicts> <http://www.koalie.net/foaf.rdf>;
<http://open.vocab.org/terms/depicts> <http://www.grorg.org/dean/foaf.rdf>;
<http://open.vocab.org/terms/depicts> <http://www.grorg.org/dean/>;
<http://open.vocab.org/terms/depicts> "mailto:[email protected]";
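To reuse the depicted resources in a later step, the output above can be reduced to a plain list of URIs. The following is a rough sketch that assumes the one-statement-per-line layout shown above; the function name `depicted_uris` is hypothetical, and literal objects (such as the quoted `mailto:` value) are deliberately skipped.

```shell
#!/usr/bin/env bash
# Sketch: read GRDDL output on stdin and print each depicted URI once.
depicted_uris() {
  # Keep only depicts statements whose object is a URI in angle brackets,
  # print that URI without the brackets, then sort and de-duplicate.
  sed -n 's|.*<http://open.vocab.org/terms/depicts>[[:space:]]*<\([^>]*\)>.*|\1|p' |
    sort -u
}
```

A possible use is `grddl.sh foaf.rdf.graffle | depicted_uris > depicted.txt`, where `depicted.txt` is just an illustrative file name.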
- SPARQL query for all vsr:Graphic in http://aquarius.tw.rpi.edu/projects/ieeevis/sparql and download the files.
- [GRDDL](Recovering Data from View) the files to produce new RDF graphs. (TODO: #timestamp the originating visual element and specialize the original)
- aggregate-source-rdf into SPARQL endpoint.
- SPARQL query for all depicted resources, and dereference them all. Publish the result as a new dataset, which is a superset of the visualization-derived data subset.
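The last step could look roughly like the sketch below. It is not part of Prizms: the function names and the `depicted-N.rdf` output file names are hypothetical, and it assumes that the endpoint named earlier accepts standard SPARQL protocol GET requests and can return `text/csv` results. Only the `depicts` predicate and the endpoint URL come from this page.

```shell
#!/usr/bin/env bash
# Endpoint named earlier on this page.
ENDPOINT='http://aquarius.tw.rpi.edu/projects/ieeevis/sparql'

# Print a SPARQL query selecting every distinct depicted resource that is a URI.
depicts_query() {
  cat <<'EOF'
SELECT DISTINCT ?depicted
WHERE {
  ?graphic <http://open.vocab.org/terms/depicts> ?depicted .
  FILTER(isIRI(?depicted))
}
EOF
}

# Ask the endpoint for CSV results (assumed to be supported) and
# print one URI per line, dropping the CSV header row and quoting.
list_depicted() {
  curl -sS -G "$ENDPOINT" \
       -H 'Accept: text/csv' \
       --data-urlencode "query=$(depicts_query)" |
    tail -n +2 | tr -d '"\r'
}

# Dereference each URI read from stdin, asking for RDF/XML.
# Output file names are hypothetical.
dereference_all() {
  local n=0 uri
  while IFS= read -r uri; do
    n=$((n + 1))
    curl -sSL -H 'Accept: application/rdf+xml' -o "depicted-$n.rdf" "$uri" ||
      echo "could not retrieve $uri" >&2
  done
}
```

Running `list_depicted | dereference_all` in an empty directory would leave one file per depicted resource, which could then be organized and published like any other retrieved source.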