Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
* enable release diff in odk config
* run release simple diff and post diff in PR
* fix release artefact uri and diff filename
* add edges, synonyms, xrefs and cl_terms reports to odk config
* update custom reports to only cl
* wip ontology content report
* add cxg and hra numbers
* revert default report sparql queries
* create custom reports for CL only
* add custom CL reports to odk config
* script to generate content summary
* add command to generate content summary as part of release
* revert changes in edges, synonyms and xrefs reports
* add instructions to add summary table in release notes
* use cl-base to run diff
* change file name and actions version
* update to commit cl-base-diff and not full release diff
* add report on diff between release and previous release

  The report is generated by the OAK diff command using the base releases for better comparison. The report shows new terms, new relationships, obsolete terms, and changes to synonyms and definitions. The report is appended to the table with the ontology content summary and saved in the reports/summary_release.md file to be used as a release note.

* update cl-release docs to reflect the new release process

  The release notes need to be updated, and this commit explains how to fix a current issue in the OAK diff command when generating the report.

* rewording explanation in readme

  Co-authored-by: Aleix Puig <[email protected]>

* add missing dependency in prepare_content_summary goal

  The file is used in the rule, but it wasn't defined as a dependency, which could be used as an updated file.

  Co-authored-by: Nico Matentzoglu <[email protected]>

* update the sparql queries for the custom reports

  Filter out the obsolete classes and the obsolete CP namespace from the queries so that they are not counted in the custom reports and, consequently, in the ontology content summary report generated for the releases.

* use cl-base.obo to generate robot release base diff

  We need to download cl-base.obo to generate the output for the OAK diff command, so we can use the same artefact to generate the robot diff instead of downloading another artefact. This also adds the two dependencies for the `release-base-diff` target to make sure the files are updated.

* improve the documentation about CL release workflow

  Update the link to the documentation about how to update the imports, because the previous one pointed to a non-existent page. Use inline code syntax instead of a code block for the GitHub release link, because the code block was breaking the list and resetting its numbering. Finally, undo the renumbering of the last three list items that was mistakenly made in the previous commit.

---------

Co-authored-by: Anita Caron <[email protected]>
Co-authored-by: Aleix Puig <[email protected]>
Co-authored-by: Nico Matentzoglu <[email protected]>
1 parent 3d32380 commit 47e9202
Showing 14 changed files with 313 additions and 22 deletions.
@@ -0,0 +1,198 @@
""" Script to summarize content in an ontology """
import argparse
from datetime import datetime

import pandas as pd
from rdflib import Graph


class OntologyContentReport:
    """Generic class for summarizing content in an ontology"""

    def __init__(self, ontology_iri, ont_namespace):
        """
        Initialize the OntologyContentReport object.
        Args:
            ontology_iri (str): The IRI or filepath of the ontology to summarize.
            ont_namespace (str): The namespace of the ontology.
        """
        self.ontology_iri = ontology_iri
        self.ont_namespace = ont_namespace
        self.g = self._init_graph(ontology_iri)
        self.date = datetime.now().strftime("%Y-%m-%d")
        self.nb_subclass_root = None
        self.nb_annotations = None
        self.nb_synonyms = None
        self.nb_references = None
        self.nb_def_references = None
        self.nb_relationships = None
        self.nb_cxg = None
        self.nb_hra = None

    def _init_graph(self, ontology_iri):
        """
        Load the given ontology into a Graph object.
        Args:
            ontology_iri (str): The IRI or filepath of the ontology.
        Returns:
            rdflib.Graph: The loaded ontology graph.
        """
        g = Graph()
        g.parse(ontology_iri, format="xml")
        return g

    def query(self, query):
        """
        Execute a SPARQL COUNT query on the ontology graph.
        Args:
            query (str): The SPARQL query to execute; it must bind ?count.
        Returns:
            int: The count of query results.
        """
        response = self.g.query(query)
        return int(response.bindings[0]["count"])

    def get_content_summary(self):
        """
        Query the ontology graph to get the content summary.
        """
        self.nb_subclass_root = self.query(f"""
            SELECT (COUNT (DISTINCT ?class) AS ?count)
            WHERE {{
                ?ont rdf:type owl:Ontology .
                ?ont <http://purl.obolibrary.org/obo/IAO_0000700> ?root .
                ?class rdfs:subClassOf* ?root .
                FILTER (STRSTARTS(STR(?class), "http://purl.obolibrary.org/obo/{self.ont_namespace}_"))
            }}
        """)

        self.nb_annotations = self.query(f"""
            SELECT (COUNT (?annotation) AS ?count)
            WHERE {{
                ?annotation rdf:type owl:AnnotationProperty .
                ?class rdf:type owl:Class .
                ?class ?annotation ?value .
                FILTER (STRSTARTS(STR(?class), "http://purl.obolibrary.org/obo/{self.ont_namespace}_"))
            }}
        """)

        self.nb_cxg = self.query(f"""
            SELECT (COUNT (?cxg) AS ?count)
            WHERE {{
                ?cxg rdf:type owl:Class .
                ?cxg <http://www.geneontology.org/formats/oboInOwl#inSubset> <http://purl.obolibrary.org/obo/cl#cellxgene_subset> .
                FILTER (STRSTARTS(STR(?cxg), "http://purl.obolibrary.org/obo/{self.ont_namespace}_"))
            }}
        """)

        self.nb_hra = self.query(f"""
            SELECT (COUNT (?hra) AS ?count)
            WHERE {{
                ?hra rdf:type owl:Class .
                ?hra <http://www.geneontology.org/formats/oboInOwl#inSubset> <http://purl.obolibrary.org/obo/uberon/core#human_reference_atlas> .
                FILTER (STRSTARTS(STR(?hra), "http://purl.obolibrary.org/obo/{self.ont_namespace}_"))
            }}
        """)

        self.nb_synonyms = self.count_report(
            self.load_report(f"{self.ont_namespace.lower()}-synonyms")
        )

        self.nb_relationships = self.count_report(
            self.load_report(f"{self.ont_namespace.lower()}-edges")
        )

        self.nb_references = self.count_report(
            self.load_report(f"{self.ont_namespace.lower()}-xrefs")["?xref"].unique()
        )

        self.nb_def_references = self.count_report(
            self.load_report(
                f"{self.ont_namespace.lower()}-def-xrefs"
            )["?xref"].unique()
        )

    def load_report(self, report_type):
        """
        Load a report from a file.
        Args:
            report_type (str): The type of report to load.
        Returns:
            pandas.DataFrame: The loaded report data.
        """
        return pd.read_csv(f"reports/{report_type}.tsv", sep="\t")

    def count_report(self, data):
        """
        Count the number of rows in a report.
        Args:
            data (pandas.DataFrame): The report data.
        Returns:
            int: The number of rows in the report.
        """
        return len(data)

    def prepare_report(self):
        """
        Prepare the content summary report for printing.
        """
        print(f"# Release Notes {self.date}")
        print("## Ontology content summary")

        summary_table = [
            {
                "Metric": "Number of subclasses of root",
                "Value": self.nb_subclass_root
            },
            {
                "Metric": f"Number of annotations on {self.ont_namespace} terms",
                "Value": self.nb_annotations
            },
            {
                "Metric": "Number of synonyms",
                "Value": self.nb_synonyms
            },
            {
                "Metric": "Number of unique references",
                "Value": self.nb_references
            },
            {
                "Metric": "Number of unique references in definitions",
                "Value": self.nb_def_references
            },
            {
                "Metric": f"Number of relationships with {self.ont_namespace} term as subject",
                "Value": self.nb_relationships
            },
            {
                "Metric": "Number of cellxgene classes",
                "Value": self.nb_cxg
            },
            {
                "Metric": "Number of HRA classes",
                "Value": self.nb_hra
            }
        ]

        print(pd.DataFrame(summary_table).to_markdown(index=False))


if __name__ == "__main__":
    cli = argparse.ArgumentParser()
    cli.add_argument("--ontology_iri", type=str, help="IRI or filepath of ontology to summarize")
    cli.add_argument("--ont_namespace", type=str, help="Ontology namespace")

    args = cli.parse_args()

    report = OntologyContentReport(args.ontology_iri, args.ont_namespace)
    report.get_content_summary()
    report.prepare_report()
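A note on the final `to_markdown` call: pandas delegates `DataFrame.to_markdown` to the optional `tabulate` package, so the release environment needs it installed. As a rough illustration of what that step produces, the following standard-library sketch renders the same Metric/Value rows as a pipe table. The helper name `rows_to_markdown` and the sample values are hypothetical and not part of this commit, and the column padding will differ from what `tabulate` emits.

```python
def rows_to_markdown(rows, columns=("Metric", "Value")):
    """Render a list of dicts as a GitHub-style pipe table (hypothetical helper)."""
    header = "| " + " | ".join(columns) + " |"
    divider = "| " + " | ".join("---" for _ in columns) + " |"
    body = [
        "| " + " | ".join(str(row[col]) for col in columns) + " |"
        for row in rows
    ]
    return "\n".join([header, divider, *body])


# Illustrative values only; the real numbers come from the SPARQL queries
# and the TSV reports loaded by the script.
example_rows = [
    {"Metric": "Number of synonyms", "Value": 10},
    {"Metric": "Number of HRA classes", "Value": 3},
]
print(rows_to_markdown(example_rows))
```

Because the values are passed through `str()`, the sketch accepts plain integers as well as objects whose string form is the number.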
@@ -0,0 +1,16 @@
prefix oio: <http://www.geneontology.org/formats/oboInOwl#>
prefix owl: <http://www.w3.org/2002/07/owl#>
prefix definition: <http://purl.obolibrary.org/obo/IAO_0000115>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?cls ?xref WHERE
{
  ?cls definition: ?def .
  ?ax a owl:Axiom;
      owl:annotatedSource ?cls;
      owl:annotatedProperty definition:;
      owl:annotatedTarget ?def;
      oio:hasDbXref ?xref .
  FILTER NOT EXISTS { ?cls owl:deprecated "true"^^xsd:boolean . }
  FILTER(isIRI(?cls) && STRSTARTS(str(?cls), "http://purl.obolibrary.org/obo/CL_") || STRSTARTS(str(?cls), "http://purl.obolibrary.org/obo/cl#"))
}
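The summary script consumes this query's output by reading the TSV report and counting the distinct values in its `?xref` column. The sketch below shows that counting step with only the standard library. The function name `count_unique` and the sample rows are made up for illustration; only the `?xref` column name comes from the script. Unlike the pandas `unique()` call, this version skips empty cells rather than counting them as a value.

```python
import csv
import io


def count_unique(tsv_file, column="?xref"):
    """Count distinct non-empty values in one column of a tab-separated report."""
    reader = csv.DictReader(tsv_file, delimiter="\t")
    return len({row[column] for row in reader if row.get(column)})


# Made-up rows standing in for a def-xrefs report: two classes share one
# reference, so three rows yield two distinct references.
sample = io.StringIO(
    "?cls\t?xref\n"
    "CL:0000001\tPMID:1\n"
    "CL:0000002\tPMID:1\n"
    "CL:0000002\tPMID:2\n"
)
print(count_unique(sample))
```

Counting distinct values rather than rows matters here because the same reference is often cited by several definitions.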
@@ -0,0 +1,21 @@
prefix owl: <http://www.w3.org/2002/07/owl#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?x ?p ?y
WHERE {
  {?x rdfs:subClassOf [
    a owl:Restriction ;
    owl:onProperty ?p ;
    owl:someValuesFrom ?y ]
  }
  UNION {
    ?x rdfs:subClassOf ?y .
    BIND(rdfs:subClassOf AS ?p)
  }
  ?x a owl:Class .
  ?y a owl:Class .
  FILTER NOT EXISTS { ?x owl:deprecated "true"^^xsd:boolean . }
  FILTER(isIRI(?x) && STRSTARTS(str(?x), "http://purl.obolibrary.org/obo/CL_") || STRSTARTS(str(?x), "http://purl.obolibrary.org/obo/cl#"))
}
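One detail of the final FILTER is worth spelling out: in SPARQL, `&&` binds more tightly than `||`, so the expression groups as `(isIRI && CL_ prefix) || cl# prefix`, and the `isIRI` test guards only the first prefix. The sketch below models both groupings with plain Python booleans. The function names are hypothetical, and the model deliberately ignores SPARQL's error semantics (for example, how an engine treats `str()` on a blank node), so it shows the grouping only and makes no claim that the two forms return different rows on the real ontology.

```python
CL_TERM = "http://purl.obolibrary.org/obo/CL_"
CL_HASH = "http://purl.obolibrary.org/obo/cl#"


def filter_as_written(is_iri, value):
    """Grouping implied by operator precedence: (isIRI && A) || B."""
    return (is_iri and value.startswith(CL_TERM)) or value.startswith(CL_HASH)


def filter_parenthesized(is_iri, value):
    """Alternative grouping with explicit parentheses: isIRI && (A || B)."""
    return is_iri and (value.startswith(CL_TERM) or value.startswith(CL_HASH))


# The two groupings disagree only when the subject is not an IRI but its
# string form starts with the cl# prefix.
cases = [
    (True, CL_TERM + "0000000"),
    (True, CL_HASH + "some_subset"),
    (True, "http://purl.obolibrary.org/obo/GO_0000001"),
    (False, CL_HASH + "some_subset"),
]
for is_iri, value in cases:
    print(is_iri, value, filter_as_written(is_iri, value), filter_parenthesized(is_iri, value))
```

If the intent is for `isIRI` to guard both prefixes, wrapping the two `STRSTARTS` calls in parentheses would state that explicitly.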