Program to pull out 5' and 3' UTRs from a CDS file
8.0 years ago

I have compiled nucleotide FASTA sequences for a set of genes I am working on, and I need to pull out the 5' and 3' UTRs from them.

Before you ask: I tried working with a source file containing an entire UTR database, but I was only able to retrieve UTRs for around half of my gene set. Many were reported as having no sequence available.

From NCBI I was able to obtain the total length of each gene sequence and the position of its CDS, and from these I can deduce the UTR coordinates.

For example, gene X:

sequence: 1..6521     CDS: 455..4453

so:

5' UTR: 1..454     3' UTR: 4454..6521

This is the rationale I am going with.

Can you suggest a program that retrieves a given range of nucleotides for each gene?


If you have both files (one with the nucleotide sequences and one with the CDS coordinates), I could help you write a script to extract the UTR sequences for each gene.
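
As a rough idea of what such a script could look like, here is a minimal Python sketch. The file layouts are assumptions, not something stated in the question: a FASTA file whose record IDs match the gene IDs, and a whitespace-separated table of `gene_id  cds_start  cds_end` with 1-based inclusive coordinates. The function names are made up for illustration, and the toy data stands in for real files.

```python
import io


def read_fasta(handle):
    """Parse FASTA text into {record_id: sequence}; the id is the first word of the header."""
    records, current = {}, None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            current = line[1:].split()[0]
            records[current] = []
        elif current is not None:
            records[current].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


def read_cds_table(handle):
    """Parse lines of 'gene_id cds_start cds_end' (assumed layout, 1-based inclusive)."""
    coords = {}
    for line in handle:
        fields = line.split()
        if len(fields) < 3 or line.startswith("#"):
            continue  # skip blank, short, or comment lines
        coords[fields[0]] = (int(fields[1]), int(fields[2]))
    return coords


def extract_utrs(seq, cds_start, cds_end):
    """Slice the 5' and 3' UTRs out of a transcript given 1-based inclusive CDS coordinates."""
    if not 1 <= cds_start <= cds_end <= len(seq):
        raise ValueError("CDS coordinates fall outside the sequence")
    # Python slices are 0-based and end-exclusive, hence the shifted indices.
    return seq[:cds_start - 1], seq[cds_end:]


# Toy data in place of real files; with real input, pass open("genes.fa") etc.
fasta = io.StringIO(">geneX\nGGCC\nATGAAATAG\nTTTT\n")
table = io.StringIO("geneX\t5\t13\n")

sequences = read_fasta(fasta)
for gene, (start, end) in read_cds_table(table).items():
    five_utr, three_utr = extract_utrs(sequences[gene], start, end)
    print(gene, five_utr, three_utr)  # geneX GGCC TTTT
```

Genes whose CDS starts at position 1 or ends at the last base simply yield an empty string for that UTR, which makes incomplete records easy to spot in the output.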


Hi OP,

Any updates on your progress? I am facing the same problem. Please let me know if you have found a solution.

I have tried the UCSC Table Browser. Apparently, for my species the UTRs are only annotated arbitrarily, as the 200 bp upstream of the start codon and the 200 bp downstream of the stop codon.