Phenotype and organism model references for a large list of genes
11 months ago
storm1907 ▴ 30

Hi all. I have to generate a table for about 3500 genes with the following columns:

1) references for each gene; 2) pathological features; 3) a reference to the model organism, e.g. the MGI number for mouse

Are there any databases or software tools that take a long gene list as input and output, if not all, then at least one of the columns listed above?

I am really frustrated, because doing this manually, gene by gene, is exhausting.

Thank you in advance!

database • 423 views
11 months ago

For human genes, you can use Mondo: https://github.com/monarch-initiative/mondo

    # Download and unpack Apache Jena under WORK/
    mkdir -p WORK/apache-jena-4.8.0/
    wget -O "WORK/jeter.zip" "https://dlcdn.apache.org/jena/binaries/apache-jena-4.8.0.zip"
    (cd WORK && unzip jeter.zip && rm jeter.zip)
    # Download the latest Mondo release and load it into a TDB store
    wget -O "WORK/mondo.owl" "https://github.com/monarch-initiative/mondo/releases/latest/download/mondo.owl"
    WORK/apache-jena-4.8.0/bin/tdbloader --loc=WORK/TDB/ WORK/mondo.owl
    # Run the query and print the results as TSV
    WORK/apache-jena-4.8.0/bin/tdbquery --loc=WORK/TDB/ --query=query.sparql --results=TSV

Output:

    ?gene_name  ?gene   ?property_label ?property   ?entity_label   ?entity
    "RIT1"  <http://identifiers.org/hgnc/10023> "has material basis in germline mutation in"    <http://purl.obolibrary.org/obo/RO_0004003> "Noonan syndrome 8" <http://purl.obolibrary.org/obo/MONDO_0014143>
    "RYR2"  <http://identifiers.org/hgnc/10484> "has material basis in germline mutation in"    <http://purl.obolibrary.org/obo/RO_0004003> "arrhythmogenic right ventricular dysplasia 2"  <http://purl.obolibrary.org/obo/MONDO_0010975>
    "SCN1A" <http://identifiers.org/hgnc/10585> "has material basis in germline mutation in"    <http://purl.obolibrary.org/obo/RO_0004003> "migraine, familial hemiplegic, 3"  <http://purl.obolibrary.org/obo/MONDO_0012320>
    "SCN1B" <http://identifiers.org/hgnc/10586> "has material basis in germline mutation in"    <http://purl.obolibrary.org/obo/RO_0004003> "Brugada syndrome 5"    <http://purl.obolibrary.org/obo/MONDO_0013015>
    "SCN1B" <http://identifiers.org/hgnc/10586> "has material basis in germline mutation in"    <http://purl.obolibrary.org/obo/RO_0004003> "atrial fibrillation, familial, 13" <http://purl.obolibrary.org/obo/MONDO_0014155>
    "SCN2B" <http://identifiers.org/hgnc/10589> "has material basis in germline mutation in"    <http://purl.obolibrary.org/obo/RO_0004003> "atrial fibrillation, familial, 14" <http://purl.obolibrary.org/obo/MONDO_0014156>
    "SCN4B" <http://identifiers.org/hgnc/10592> "has material basis in germline mutation in"    <http://purl.obolibrary.org/obo/RO_0004003> "long QT syndrome 10"   <http://purl.obolibrary.org/obo/MONDO_0012737>
    "SCN5A" <http://identifiers.org/hgnc/10593> "has material basis in germline mutation in"    <http://purl.obolibrary.org/obo/RO_0004003> "progressive familial heart block, type 1A" <http://purl.obolibrary.org/obo/MONDO_0007240>
    "SCN5A" <http://identifiers.org/hgnc/10593> "has material basis in germline mutation in"    <http://purl.obolibrary.org/obo/RO_0004003> "Brugada syndrome 1"    <http://purl.obolibrary.org/obo/MONDO_0011001>

with query.sparql:

    PREFIX owl: <http://www.w3.org/2002/07/owl#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX obo: <http://purl.obolibrary.org/obo/>

    SELECT DISTINCT
        ?gene_name ?gene
        ?property_label ?property
        ?entity_label ?entity
    WHERE
    {
        # A disease class that is linked to a gene through an OWL restriction
        ?entity rdfs:subClassOf [ rdf:type owl:Restriction ;
                                  owl:onProperty ?property ;
                                  owl:someValuesFrom ?gene ] .
        ?entity rdfs:label ?entity_label .

        # Restrict to subclasses of MONDO_0004995 (the example output is all
        # cardiovascular conditions); drop this line to cover every Mondo disease
        ?entity rdfs:subClassOf* obo:MONDO_0004995 .

        # Labels are optional so that unlabeled properties or genes are still returned
        OPTIONAL {
            ?property rdfs:label ?property_label .
        }

        OPTIONAL {
            ?gene rdfs:label ?gene_name .
        }

        # Keep only Mondo disease IRIs and HGNC gene IRIs
        FILTER (isIRI(?entity) && STRSTARTS(str(?entity), "http://purl.obolibrary.org/obo/MONDO_"))
        FILTER (isIRI(?gene) && regex(str(?gene), "hgnc"))
    }
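
To get from this output to a table for a specific gene list, one option is to save the TSV to a file and keep only the rows whose gene symbol is in your list. The following is a minimal sketch, not part of the original answer: the function name `filter_by_genes` and the file names are hypothetical, and it assumes the gene list has one HGNC symbol per line and that the first TSV column holds the quoted symbol, as in the output shown above.

```shell
# filter_by_genes GENES.txt RESULTS.tsv
# Prints the TSV header plus every row whose first column matches a listed symbol.
# Assumption: GENES.txt is non-empty, with one gene symbol per line.
filter_by_genes() {
    awk -F '\t' '
        # First file: remember each symbol, wrapped in quotes like the TSV values
        NR == FNR { if ($1 != "") wanted["\"" $1 "\""] = 1; next }
        # Second file: keep the header line and any row for a wanted gene
        FNR == 1 || ($1 in wanted)
    ' "$1" "$2"
}
```

For example, after redirecting the `tdbquery` output to `WORK/results.tsv`, running `filter_by_genes genes.txt WORK/results.tsv > table.tsv` would write the matching rows to `table.tsv`. Genes with no Mondo association simply do not appear in the result.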
