Getting Names Of Proteins Around A Protein In A Genome
10.9 years ago
Pappu ★ 2.1k

I have a list of UniProt IDs for one protein in different species. I want to list the names of the proteins around it (2 upstream and 2 downstream) in each genome. Is there a tool for this? Thanks.

python • 1.8k views

You mean "the names of proteins encoded by the genes around it." Upstream/downstream refers to genes, not proteins.

Also - what have you tried or searched for? There's no evidence that you've tried to address this problem yourself.

10.9 years ago
Raygozak ★ 1.4k

If your GFF file is sorted by chromosome and start position, you could do the following in bash (replace {protein_id} with the ID you are looking for):

awk -F'\t' '$3=="CDS"' file.gff | grep -A 2 -B 2 {protein_id}

Hope this helps.
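
Since the question is tagged python, here is a rough sketch of the same idea in Python. It is not a polished tool. It assumes a tab-separated GFF with one CDS line per protein (typical for prokaryotic annotations; multi-exon CDS features would need collapsing first), and that the ID you search for actually appears in the attributes column, which may not be true for UniProt accessions without a mapping step. The function names `parse_cds` and `neighbours` are made up for this example.

```python
def parse_cds(lines):
    """Return CDS rows as (seqid, start, end, strand, attributes), sorted by position."""
    rows = []
    for line in lines:
        # Skip comments, directives and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        # Keep only well-formed CDS rows (GFF has 9 columns).
        if len(cols) < 9 or cols[2] != "CDS":
            continue
        rows.append((cols[0], int(cols[3]), int(cols[4]), cols[6], cols[8]))
    # Sort by sequence name, then start coordinate, so neighbours are adjacent.
    rows.sort(key=lambda r: (r[0], r[1]))
    return rows


def neighbours(rows, protein_id, flank=2):
    """Return (before, after): up to `flank` CDS rows on each side of the hit.

    Matching is a plain substring test on the attributes column, like grep,
    so an ID that is a prefix of another ID can match the wrong row.
    Only rows on the same sequence (chromosome/contig) as the hit are returned.
    """
    for i, row in enumerate(rows):
        if protein_id in row[4]:
            lo = max(0, i - flank)
            hi = min(len(rows), i + flank + 1)
            before = [rows[j] for j in range(lo, i) if rows[j][0] == row[0]]
            after = [rows[j] for j in range(i + 1, hi) if rows[j][0] == row[0]]
            return before, after
    return [], []
```

You would call it with something like `rows = parse_cds(open("file.gff"))` and then `before, after = neighbours(rows, "your_id")`. Note that "before" and "after" are by coordinate only: for a gene on the minus strand (column 4 of each tuple), upstream and downstream are swapped.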

Really nice and simple approach!

I am working with EMBL files.
