Varmovinator

The world's best (and first) variant learning model validator (outside CAGI challenges).

For a given variant, this pulls basic information from standard bioinformatics databases, such as:

  • disease association 1,2
  • gene ontology 2
  • uniprot information 2
  • environmental factors that influence penetrance
  • putative molecular consequences 1
  • putative clinical consequences 1
  • population frequency 1
  • protein effects 2
  • gene 2

Sources: 1: OpenCRAVAT, 2: UniProt, 3: LitVar

The data will be pulled using SPARQL. An example SPARQL query is included.
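As a rough illustration of what a SPARQL pull could look like, here is a minimal sketch against the public UniProt SPARQL endpoint. It is not the example query shipped with this repository. The function names `build_variant_query` and `run_query` are hypothetical, and the graph pattern rests on an assumption: that UniProt natural variant annotations link to dbSNP through `rdfs:seeAlso` with an IRI ending in the rsID. Check that pattern against the UniProt RDF schema before relying on it.

```python
import json
import re
import urllib.parse
import urllib.request

# Public UniProt SPARQL endpoint.
UNIPROT_ENDPOINT = "https://sparql.uniprot.org/sparql"

_RSID = re.compile(r"rs\d+")


def build_variant_query(rsid: str, limit: int = 25) -> str:
    """Return a SPARQL query for UniProt variant annotations citing `rsid`."""
    # Validate the rsID so that nothing unexpected is spliced into the query.
    if not _RSID.fullmatch(rsid):
        raise ValueError(f"not a dbSNP rsID: {rsid!r}")
    # Assumption: variant annotations point at dbSNP via rdfs:seeAlso,
    # using an IRI whose last path segment is the rsID.
    return f"""
PREFIX up: <http://purl.uniprot.org/core/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?protein ?annotation ?comment
WHERE {{
  ?protein a up:Protein ;
           up:annotation ?annotation .
  ?annotation a up:Natural_Variant_Annotation ;
              rdfs:seeAlso ?xref .
  OPTIONAL {{ ?annotation rdfs:comment ?comment . }}
  FILTER (STRENDS(STR(?xref), "/{rsid}"))
}}
LIMIT {int(limit)}
""".strip()


def run_query(query: str, endpoint: str = UNIPROT_ENDPOINT) -> list[dict]:
    """POST the query and return the JSON result bindings (needs network)."""
    data = urllib.parse.urlencode({"query": query}).encode()
    request = urllib.request.Request(
        endpoint,
        data=data,
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)["results"]["bindings"]
```

Calling `run_query(build_variant_query("rs6003"))` would send the query over the network. Building the query separately keeps it easy to inspect or paste into the endpoint's web form.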

We have selected the following three models to test:

  1. BioGPT
  2. StabilityLM (StableVicuna-13B)
  3. GPT4
     3a. GPT4 prompted with a KG

Questions we are asking the model:

  • What diseases is rsX associated with?
  • Expand the text rsX
  • What environmental factors affect the penetrance of rsX?
  • What transcription factor pathways are affected by rsX?

Values of X tested in this case:

  • rs6003
  • rs80357914
  • rs563410947
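The question templates and the tested rsIDs can be combined into a prompt set mechanically. The sketch below shows one way to do that. The names `QUESTION_TEMPLATES`, `RSIDS`, and `build_prompts` are hypothetical and not taken from this repository.

```python
from itertools import product

# The questions listed above, with rsX replaced by a format placeholder.
QUESTION_TEMPLATES = [
    "What diseases is {rsid} associated with?",
    "Expand the text {rsid}",
    "What environmental factors affect the penetrance of {rsid}?",
    "What transcription factor pathways are affected by {rsid}?",
]

# The variants tested in this case.
RSIDS = ["rs6003", "rs80357914", "rs563410947"]


def build_prompts(templates=QUESTION_TEMPLATES, rsids=RSIDS):
    """Return one prompt record per (variant, question) pair."""
    return [
        {"rsid": rsid, "question": template.format(rsid=rsid)}
        for rsid, template in product(rsids, templates)
    ]


if __name__ == "__main__":
    for prompt in build_prompts():
        print(prompt["rsid"], "|", prompt["question"])
```

Each record keeps the rsID next to the question text, so a model's answer can later be matched to the database record for the same variant.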

We will develop a model to:

  • tag discrepant assertions
  • calculate a consistency score
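The scoring model is not defined yet. As a placeholder, the sketch below shows one simple possibility: compare the set of terms a model asserts with the set pulled from the databases, tag model-only terms as discrepant, and use the Jaccard overlap of the two sets as the consistency score. The function names, the exact-match normalization, and the choice of Jaccard are all assumptions made for illustration.

```python
def normalize(term: str) -> str:
    """Lower-case a term and collapse whitespace so trivial variants match."""
    return " ".join(term.lower().split())


def compare_assertions(model_terms, reference_terms):
    """Tag discrepant assertions and compute a Jaccard consistency score.

    model_terms: terms extracted from a model's answer (hypothetical input).
    reference_terms: terms pulled from the reference databases.
    """
    model = {normalize(t) for t in model_terms}
    reference = {normalize(t) for t in reference_terms}
    union = model | reference
    return {
        # Asserted by the model and present in the reference data.
        "supported": sorted(model & reference),
        # Asserted by the model but absent from the reference data.
        "discrepant": sorted(model - reference),
        # Present in the reference data but not mentioned by the model.
        "missed": sorted(reference - model),
        # Jaccard overlap; two empty sets count as fully consistent.
        "consistency": len(model & reference) / len(union) if union else 1.0,
    }


if __name__ == "__main__":
    # Placeholder terms only, not real annotations for any variant.
    print(compare_assertions(["Condition A", "condition  b"],
                             ["condition a", "condition c"]))
```

Exact string matching will miss synonyms, so a real implementation would likely map terms to ontology identifiers before comparing them.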

Note: some answers are nonsensical, or at least decontextualized, which indicates that some models likely need, at a minimum, to be prompted with context.