Question: How do I get data that maps the protein domain of the Interpro database to the hg19 genome coordinates?
14 months ago, eric.kai0918 wrote:

Hi,

How do I get data that maps the protein domains of the InterPro database to hg19 genome coordinates? For example, I want output in this format: Chromosome, StartPosition, EndPosition, ProteinDomain.

Thanks.

14 months ago, Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote:

I wrote MapUniProtFeatures (http://lindenb.github.io/jvarkit/MapUniProtFeatures.html), but I have not used it since I wrote it.

$ java -jar dist/mapuniprot.jar \
    -R /path/to/human_g1k_v37.fasta \
    -u /path/uri/uniprot.org/uniprot_sprot.xml.gz \
    -k <(curl -s "http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/knownGene.txt.gz" | gunzip -c | awk -F '\t' '{if($2 ~ ".*_.*") next; OFS="\t"; gsub(/chr/,"",$2); print;}') |\
    LC_ALL=C sort -t $'\t' -k1,1 -k2,2n -k3,3n | uniq | head


1   69090   69144   topological_domain  1000    +   69090   69144   255,0,0 1   54  0
1   69144   69216   transmembrane_region    1000    +   69144   69216   255,0,0 1   72  0
1   69216   69240   topological_domain  1000    +   69216   69240   255,0,0 1   24  0
1   69240   69306   transmembrane_region    1000    +   69240   69306   255,0,0 1   66  0
1   69306   69369   topological_domain  1000    +   69306   69369   255,0,0 1   63  0
1   69357   69636   disulfide_bond  1000    +   69357   69636   255,0,0 1   279 0
1   69369   69429   transmembrane_region    1000    +   69369   69429   255,0,0 1   60  0
1   69429   69486   topological_domain  1000    +   69429   69486   255,0,0 1   57  0
1   69486   69543   transmembrane_region    1000    +   69486   69543   255,0,0 1   57  0
1   69543   69654   topological_domain  1000    +   69543   69654   255,0,0 1   111 0
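
The rows above look like BED-style records (chromosome, start, end, feature name, then display fields). To get the four columns asked for in the question, something like the following might work. It is only a minimal sketch: `bed_to_domains` is a hypothetical helper, not part of jvarkit, and it assumes the input is tab-separated with chromosome, start, end and name in columns 1 to 4.

```shell
# Hypothetical helper (not part of jvarkit): reduce BED-like rows read from
# stdin to Chromosome, StartPosition, EndPosition, ProteinDomain.
# Assumption: tab-separated input, columns 1-4 = chrom, start, end, name.
bed_to_domains() {
    awk -F '\t' '
        BEGIN { OFS = "\t"; print "Chromosome", "StartPosition", "EndPosition", "ProteinDomain" }
        NF >= 4 { print $1, $2, $3, $4 }   # skip rows with fewer than 4 fields
    '
}
```

It would be appended to the pipeline in place of `head`, for example `... | uniq | bed_to_domains > domains.tsv`. If the rows really are BED, the start is 0-based and the end is exclusive, so add 1 to the start if 1-based positions are needed. The chromosome names also lack the "chr" prefix because the command strips it to match the g1k_v37 reference.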

14 months ago, eric.kai0918 replied:

Thank you for your answer.