Get CDS from WGS fasta OR download just CDS
9.1 years ago
moranr ▴ 290

Hi,

I want to download the coding sequences (CDS) for many species' genomes. A script gets me about 70% of them, but the rest I have to fetch manually, and for those I am having trouble finding the CDS rather than the WGS assembly.

For example, ftp://ftp.ncbi.nih.gov/genomes/Acromyrmex_echinatior contains many directories and files. I have the entire WGS genome in a FASTA file, but I don't know how to get the CDS from it.

Can someone please assist me?

Thanks,
R

ftp CDS genome ncbi • 3.1k views
9.1 years ago
mark.ziemann ★ 1.9k

There is a genome annotation file in the GFF folder (ref_Aech_3.9_top_level.gff3) that contains the positional information for genes, mRNAs, exons and CDS. If you use a tool like grep to pull out the identifiers of all the CDS features (e.g. gene=LOC105143314), you can then fetch the matching sequences from the rna.fa file.
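
As a rough illustration of that approach, here is a minimal Python sketch rather than a tested pipeline. It assumes the GFF3 file is tab-separated with nine columns and that CDS rows carry a gene=... attribute, as in the example above. It also assumes, without having checked the actual file, that each rna.fa header mentions the gene identifier as a separate token, e.g. "(LOC105143314)". The function names cds_gene_ids and filter_fasta are made up for this example.

```python
import re


def cds_gene_ids(gff_lines):
    """Collect the gene= attribute value of every CDS feature in GFF3 lines."""
    ids = set()
    for line in gff_lines:
        if line.startswith("#"):
            continue  # skip GFF3 directives and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "CDS":
            continue  # keep only well-formed CDS rows
        for attr in cols[8].split(";"):
            key, _, value = attr.partition("=")
            if key == "gene" and value:
                ids.add(value)
    return ids


def filter_fasta(fasta_lines, ids):
    """Yield the lines of FASTA records whose header contains one of the ids.

    Assumption: the identifier appears in the header as a whole token,
    so LOC1 does not accidentally match LOC10.
    """
    keep = False
    for line in fasta_lines:
        if line.startswith(">"):
            tokens = set(re.findall(r"[A-Za-z0-9_.]+", line))
            keep = not ids.isdisjoint(tokens)
        if keep:
            yield line
```

Both functions accept any iterable of lines, so open file handles for ref_Aech_3.9_top_level.gff3 and rna.fa can be passed in directly and the result written out with writelines(). Keep in mind that the records in rna.fa are transcript sequences; if they include UTRs, trimming them to the CDS coordinates from the GFF would be a further step.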
