Question: How to get CDS.fasta from gff file + sequenceFile.fasta
1
4.6 years ago by
moranr260
Ireland
moranr260 wrote:

I have two files:

  1. An annotation file in GFF format
  2. A FASTA-formatted sequence file for a whole genome

From these I want to produce a file of coding sequences: myGenomeCDS.fasta

How can I do this with Python or with existing tools?

Thanks

annotation sequence gff python • 2.8k views
5
4.6 years ago by
iraun3.6k
Norway
iraun3.6k wrote:

If your GFF/GTF file includes CDS annotations, you can extract the CDS sequences with gffread. Here -g is the genome FASTA and -x is the output file that receives the spliced CDS of each transcript:

gffread -g genome.fa -x CDS.fa annotation.gtf
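
Since the question also asks about Python, here is a minimal sketch of the same idea using only the standard library. It is not how gffread works internally. It assumes GFF3-style attributes (CDS rows carrying Parent=... or ID=...), a genome small enough to hold in memory, and sequences made of A/C/G/T/N. The function names read_fasta and extract_cds are made up for illustration.

```python
from collections import defaultdict

# Translation table used to complement minus-strand sequences.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def read_fasta(lines):
    """Parse FASTA lines into {sequence_id: sequence}."""
    seqs, name, chunks = {}, None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                seqs[name] = "".join(chunks)
            # Keep only the first word of the header as the sequence ID.
            name, chunks = line[1:].split()[0], []
        elif line:
            chunks.append(line)
    if name is not None:
        seqs[name] = "".join(chunks)
    return seqs


def extract_cds(gff_lines, genome):
    """Return {parent_id: spliced CDS sequence} from GFF3 lines."""
    segments = defaultdict(list)
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "CDS":
            continue
        # Assumption: GFF3 key=value attributes separated by ';'.
        attrs = dict(
            field.strip().split("=", 1)
            for field in cols[8].split(";")
            if "=" in field
        )
        parent = attrs.get("Parent") or attrs.get("ID")
        if parent is None:
            continue
        segments[parent].append((int(cols[3]), int(cols[4]), cols[0], cols[6]))

    cds = {}
    for parent, parts in segments.items():
        parts.sort()  # order segments by genomic start
        seqid, strand = parts[0][2], parts[0][3]
        # GFF coordinates are 1-based and inclusive.
        seq = "".join(genome[seqid][start - 1:end] for start, end, _, _ in parts)
        if strand == "-":
            # Reverse-complement the joined segments for minus-strand features.
            seq = seq.translate(_COMPLEMENT)[::-1]
        cds[parent] = seq
    return cds
```

To use it, open both files, pass the file handles to read_fasta and extract_cds, and write each returned entry as a ">name" header line followed by its sequence. For real annotations a tested tool such as gffread is the safer choice, because this sketch ignores details like the phase column and multi-parent features.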

 

2
4.6 years ago by
arnstrm1.7k
Ames, IA
arnstrm1.7k wrote:

There are many ways to do this. If you don't want to write your own code, use utilities from existing packages such as BEDTools (fastaFromBed, also available as bedtools getfasta), Glimmer (extract), or Galaxy (the extract features tool).
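
If you take the BEDTools route, the CDS rows first have to become BED intervals, and the coordinate systems differ: GFF is 1-based and inclusive, BED is 0-based and half-open. A small sketch of that conversion follows. The function name gff_cds_to_bed and the seqid:start-end naming scheme are my own choices for illustration, not part of BEDTools.

```python
def gff_cds_to_bed(gff_lines):
    """Yield one BED6 line for every CDS row in a GFF file."""
    for line in gff_lines:
        if line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "CDS":
            continue
        seqid, start, end, strand = cols[0], int(cols[3]), int(cols[4]), cols[6]
        # Hypothetical naming scheme: keep the original GFF coordinates in the name.
        name = f"{seqid}:{start}-{end}"
        # GFF start is 1-based, BED start is 0-based; the end value is unchanged.
        yield "\t".join([seqid, str(start - 1), str(end), name, ".", strand])
```

Write the yielded lines to a file such as cds.bed, then run something like: bedtools getfasta -fi genome.fa -bed cds.bed -s -fo cds_segments.fa (the -s option reverse-complements minus-strand intervals). Note that this gives one sequence per CDS segment, so multi-exon CDSs still need to be joined per transcript afterwards.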
