How to extract 3' and 5' UTR sequences from GenBank records using Python, Perl, or R?
16 months ago

Hello all, I have 1000 GenBank records. I want to extract only the 3' UTR and 5' UTR sequences from them and store the results in Excel format. Please share your ideas and suggestions (Perl, Python, or R code).

UTR • 698 views

Hi, please post a sample gbk file and define the headers that you want to see in your output file (e.g. seqID, locusTag, sequence ...).


Please take a look at the Biopython Tutorial and Cookbook.
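
For what it's worth, here is a minimal sketch of the Biopython route, assuming the records actually carry annotated `5'UTR` / `3'UTR` features (many GenBank records do not, in which case no rows are produced). The function names `utr_rows` and `write_utr_csv` and the column headers are made up for illustration; the output is a CSV file, which Excel can open directly.

```python
import csv

# Feature keys used for UTRs in GenBank feature tables
UTR_TYPES = {"5'UTR", "3'UTR"}


def utr_rows(records):
    """Yield (record_id, feature_type, location, sequence) for each UTR feature."""
    for record in records:
        for feature in record.features:
            if feature.type in UTR_TYPES:
                # extract() follows the feature location (strand, joins) on the parent sequence
                utr_seq = feature.extract(record.seq)
                yield record.id, feature.type, str(feature.location), str(utr_seq)


def write_utr_csv(genbank_path, csv_path):
    """Parse a GenBank file and write its UTR features to a CSV file (hypothetical helper)."""
    # Imported here so utr_rows() can be tried out without Biopython installed
    from Bio import SeqIO

    with open(csv_path, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["record_id", "feature", "location", "sequence"])
        writer.writerows(utr_rows(SeqIO.parse(genbank_path, "genbank")))
```

Usage would be something like `write_utr_csv("records.gb", "utrs.csv")`, where both file names are placeholders.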

16 months ago
padwalmk ▴ 100

Hi, it's unclear whether you have a GFF file or a FASTA file.

You can look at the following post:

Extract coordinates of upstream region up to closest coding region in R

16 months ago
zubenel ▴ 110

If you have a GFF file, you might try gff2fasta.pl with the -feature option set to "five_prime_UTR" or "three_prime_UTR", or something like that. You may also want to read up on how to get the sequences of specific features with BioPerl.
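
If a Perl script is not an option, the same idea can be sketched in plain Python. This is not how gff2fasta.pl is implemented, just a rough stand-in that assumes a tab-separated GFF3 file whose UTR rows use the types `five_prime_UTR` / `three_prime_UTR`, plus a matching FASTA file. All function names and column headers here are hypothetical.

```python
import csv

# Translation table for complementing DNA (other characters are left unchanged)
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def read_fasta(lines):
    """Return {sequence_id: sequence} from an iterable of FASTA lines."""
    chunks, name = {}, None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            name = line[1:].split()[0]  # ID is the first word of the header
            chunks[name] = []
        elif line and name is not None:
            chunks[name].append(line)
    return {seq_id: "".join(parts) for seq_id, parts in chunks.items()}


def utr_rows(gff_lines, genome, feature_types=("five_prime_UTR", "three_prime_UTR")):
    """Yield one row per UTR feature, with the sequence given on the feature's strand."""
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip directives, comments and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] not in feature_types:
            continue
        seqid, ftype, strand, attrs = cols[0], cols[2], cols[6], cols[8]
        start, end = int(cols[3]), int(cols[4])  # GFF coordinates are 1-based, inclusive
        seq = genome[seqid][start - 1:end]
        if strand == "-":
            seq = seq.translate(COMPLEMENT)[::-1]  # reverse complement
        yield seqid, ftype, start, end, strand, attrs, seq


def write_csv(rows, csv_path):
    """Write the rows to a CSV file, which Excel can open."""
    with open(csv_path, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["seqid", "feature", "start", "end", "strand", "attributes", "sequence"])
        writer.writerows(rows)
```

To use it, open both files, pass the FASTA handle to `read_fasta`, pass the GFF handle and the resulting dictionary to `utr_rows`, and hand the rows to `write_csv`.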
