Question: How do I get data that maps the protein domain of the Interpro database to the hg19 genome coordinates?
16 months ago, eric.kai0918 wrote:

Hi,

How do I get data that maps the protein domains of the InterPro database to hg19 genome coordinates? For example, I want output in this format: Chromosome, StartPosition, EndPosition, ProteinDomain

Thanks.

16 months ago, Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote:

I wrote http://lindenb.github.io/jvarkit/MapUniProtFeatures.html

but I have not used it since I wrote it.

$ java  -jar dist/mapuniprot.jar \
    -R /path/to/human_g1k_v37.fasta \
    -u /path/uri/uniprot.org/uniprot_sprot.xml.gz  \
    -k <(curl -s "http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/knownGene.txt.gz" | gunzip -c | awk -F '\t' '{if($2 ~ ".*_.*") next; OFS="\t"; gsub(/chr/,"",$2);print;}'   ) |\
    LC_ALL=C sort -t $'\t' -k1,1 -k2,2n -k3,3n  | uniq | head


1   69090   69144   topological_domain  1000    +   69090   69144   255,0,0 1   54  0
1   69144   69216   transmembrane_region    1000    +   69144   69216   255,0,0 1   72  0
1   69216   69240   topological_domain  1000    +   69216   69240   255,0,0 1   24  0
1   69240   69306   transmembrane_region    1000    +   69240   69306   255,0,0 1   66  0
1   69306   69369   topological_domain  1000    +   69306   69369   255,0,0 1   63  0
1   69357   69636   disulfide_bond  1000    +   69357   69636   255,0,0 1   279 0
1   69369   69429   transmembrane_region    1000    +   69369   69429   255,0,0 1   60  0
1   69429   69486   topological_domain  1000    +   69429   69486   255,0,0 1   57  0
1   69486   69543   transmembrane_region    1000    +   69486   69543   255,0,0 1   57  0
1   69543   69654   topological_domain  1000    +   69543   69654   255,0,0 1   111 0
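
The rows above appear to follow the 12-column BED layout, so the four fields asked for in the question (Chromosome, StartPosition, EndPosition, ProteinDomain) should be the first four columns. A minimal sketch of trimming the output down to those, assuming it was saved tab-separated to a file; the names features.bed and domains.tsv are hypothetical, and the two sample rows are copied from the output above. It also puts back the "chr" prefix that the knownGene preprocessing removed, on the assumption that UCSC-style hg19 chromosome names are wanted.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical sample input: two rows copied from the output above, rebuilt with tabs.
printf '1\t69090\t69144\ttopological_domain\t1000\t+\t69090\t69144\t255,0,0\t1\t54\t0\n'  > features.bed
printf '1\t69144\t69216\ttransmembrane_region\t1000\t+\t69144\t69216\t255,0,0\t1\t72\t0\n' >> features.bed

# Keep chromosome, start, end and feature name.
# Assumption: UCSC-style names are wanted, so "chr" is prepended to column 1.
awk 'BEGIN { FS = OFS = "\t" } { print "chr" $1, $2, $3, $4 }' features.bed > domains.tsv

cat domains.tsv
```

If the layout is indeed BED, the start positions are 0-based and the end positions are exclusive, so add 1 to the start column if 1-based coordinates are needed.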

Thank you for your answer.

Reply written 16 months ago by eric.kai0918