Is there a tool or website that can identify the gene at a specified integration site on a chromosome (e.g., 1:20746689), together with its regulatory role and its function (e.g., DNA binding activity, nucleosome binding activity)?
1) Download a human GTF and convert it to XML+RDF with awk.
2) Download the Ensembl and GO mappings from NCBI, join both resources, and convert the result to XML+RDF with awk.
3) Concatenate the outputs of 1 and 2 to create an RDF database.
4) Query the database with SPARQL using jena/arq.
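The four steps above amount to: load gene intervals from a GTF, attach GO terms by gene ID, and look up which gene contains a position. The following is a minimal sketch of that logic in plain Python rather than RDF/SPARQL, so it is a swapped-in technique and not the script from the answer. The toy GTF rows, the IDs `ENSG_TOY_1` and `ENSG_TOY_2`, the gene names, and the helper names `parse_genes` and `genes_at` are all invented for illustration; real input would be an Ensembl GTF and the NCBI mapping files.

```python
import re

# Toy inputs, invented for illustration only: these IDs, names, and
# coordinates are NOT real annotations.
TOY_GTF = (
    '1\ttoy\tgene\t20740000\t20790000\t.\t+\t.\tgene_id "ENSG_TOY_1"; gene_name "GENE_A";\n'
    '1\ttoy\texon\t20740000\t20741000\t.\t+\t.\tgene_id "ENSG_TOY_1"; gene_name "GENE_A";\n'
    '1\ttoy\tgene\t30000000\t30010000\t.\t-\t.\tgene_id "ENSG_TOY_2"; gene_name "GENE_B";\n'
)
# Hypothetical gene ID -> GO term names (stands in for the joined NCBI mapping).
TOY_GO = {"ENSG_TOY_1": ["DNA binding", "nucleosome binding"]}


def parse_genes(gtf_text):
    """Return one dict per 'gene' row of a GTF (1-based, inclusive coordinates)."""
    genes = []
    for line in gtf_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        chrom, _src, feature, start, end, _score, strand, _frame, attrs = line.split("\t")
        if feature != "gene":
            continue
        # Attributes look like: key "value"; key "value";
        fields = dict(re.findall(r'(\S+) "([^"]*)"', attrs))
        genes.append({
            "chrom": chrom,
            "start": int(start),
            "end": int(end),
            "strand": strand,
            "gene_id": fields.get("gene_id"),
            "gene_name": fields.get("gene_name"),
        })
    return genes


def genes_at(genes, go_terms, chrom, pos):
    """Return genes whose interval contains pos, each with its GO terms attached."""
    hits = []
    for gene in genes:
        if gene["chrom"] == chrom and gene["start"] <= pos <= gene["end"]:
            hits.append({**gene, "go": go_terms.get(gene["gene_id"], [])})
    return hits


if __name__ == "__main__":
    for hit in genes_at(parse_genes(TOY_GTF), TOY_GO, "1", 20746689):
        # With the toy data this prints: GENE_A + DNA binding; nucleosome binding
        print(hit["gene_name"], hit["strand"], "; ".join(hit["go"]))
```

A linear scan is fine for a handful of sites; for many sites an interval index or a dedicated tool would be the more usual choice.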
Thank you so much for the detailed guidance! Just a quick query - would I be running these scripts in a bash environment to replicate the results?
Also, regarding the integration site query, will this process be able to identify the regulatory role of the gene, such as whether the site falls within a promoter or enhancer region?
I have been using a Linux environment but am fairly new to it; still, I'm eager to give these a try. Just to confirm, should I run the awk command you provided first, like this:
awk -v BUILD=GRCh38 -f gtf2rdf.awk > output.rdf
Followed by executing the Makefile with:
make Makefile
I'm not quite sure how to proceed with executing query.01.sparql afterward. Could you please provide guidance on this? Please correct me if I'm wrong. I appreciate your help.
Yeah, I used SPARQL for fun, but if you don't know these tools, you should use tools like bedtools intersect and join...
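One detail matters when moving from a GTF to an interval tool such as bedtools intersect: GTF coordinates are 1-based and inclusive, while BED intervals are 0-based and half-open. The sketch below does not call bedtools; it only illustrates the conversion and the overlap test, assuming a site written as 1:20746689 is a 1-based position. The function names `site_to_bed`, `gtf_to_bed`, and `overlaps` are hypothetical helpers, not part of any library.

```python
def site_to_bed(chrom, pos):
    """Convert a 1-based position (e.g. 1:20746689) to a one-base BED interval."""
    return (chrom, pos - 1, pos)


def gtf_to_bed(chrom, start, end):
    """Convert 1-based inclusive GTF coordinates to 0-based half-open BED coordinates."""
    return (chrom, start - 1, end)


def overlaps(a, b):
    """True if two BED intervals on the same chromosome share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


if __name__ == "__main__":
    site = site_to_bed("1", 20746689)
    # A made-up feature that starts exactly at the site, in GTF coordinates.
    feature = gtf_to_bed("1", 20746689, 20746700)
    print(site, feature, overlaps(site, feature))
```

Skipping the conversion shifts every interval by one base, which changes the result for sites that sit exactly on a feature boundary.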