Question: How to extract all promoter regions in multi-fasta format from genome using GFF?
rimgubaev wrote, 7 months ago:

Hi Everyone,

How can I extract promoter sequences (ca. 1000 bp upstream of the TSS) in multi-FASTA format from a genome (itself a multi-FASTA file of scaffolds), using the information in the corresponding GFF file? I have already tried the GFF-Ex tool, but it did not help (it finished with errors). The genome is tobacco (ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/715/135/GCF_000715135.1_Ntab-TN90/).

Does anyone know of other tools for this?

Thanks,

promoter gff fasta genome • 502 views
rimgubaev wrote, 10 weeks ago:

I finally solved this problem by combining samtools, bedtools, and a custom R script. The pipeline, wrapped in a bash script, is available here.
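
For readers without access to the linked script, here is a minimal pure-Python sketch of the same idea. It is not the pipeline above: it stands in for the steps one would typically do with samtools (scaffold lengths) and bedtools (upstream intervals, strand-aware extraction). It assumes that rows with `gene` in column 3 mark the TSS at their 5' end, that GFF coordinates are 1-based and inclusive, and that a promoter truncated by a scaffold edge should be kept at its shorter length. The function names and the header format are illustrative choices.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def read_fasta(lines):
    """Parse multi-FASTA lines into {sequence_id: sequence}."""
    seqs, name, chunks = {}, None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                seqs[name] = "".join(chunks)
            # Keep only the first word of the header as the sequence ID.
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        seqs[name] = "".join(chunks)
    return seqs


def promoter_interval(start, end, strand, seq_len, upstream=1000):
    """Return the 0-based, half-open upstream interval of a gene.

    start/end are 1-based inclusive GFF coordinates; the result is
    clipped to the scaffold boundaries.
    """
    if strand == "-":
        # On the minus strand, "upstream" lies to the right of the gene end.
        return end, min(seq_len, end + upstream)
    return max(0, start - 1 - upstream), start - 1


def promoters(gff_lines, genome, upstream=1000, feature="gene"):
    """Yield (header, sequence) for the promoter of each matching feature."""
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != feature or cols[0] not in genome:
            continue
        seqid, start, end, strand = cols[0], int(cols[3]), int(cols[4]), cols[6]
        lo, hi = promoter_interval(start, end, strand, len(genome[seqid]), upstream)
        if hi <= lo:
            continue  # gene sits at the scaffold edge; nothing upstream
        seq = genome[seqid][lo:hi]
        if strand == "-":
            seq = seq.translate(_COMPLEMENT)[::-1]  # reverse complement
        yield f"{seqid}:{lo}-{hi}({strand})", seq


# Usage sketch (file names are hypothetical):
# with open("genome.fna") as fa, open("genes.gff") as gff:
#     for header, seq in promoters(gff, read_fasta(fa)):
#         print(f">{header}\n{seq}")
```

Minus-strand promoters are reverse-complemented so that every output sequence reads 5' to 3' toward the TSS. For a genome too large to hold in memory, an indexed FASTA reader would be the more practical choice.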

shoujun.gu wrote, 7 months ago:
  1. Extract the gene IDs from the GFF file.
  2. Fetch the promoter sequences from BioMart using the gene IDs you extracted.
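
Step 1 can be sketched in a few lines of Python. This assumes a GFF3 file whose ninth column holds `key=value` pairs separated by semicolons, and that the identifier of interest is stored under `ID`. Which attribute key a given database expects is an assumption to check against the actual file; the function names are illustrative.

```python
from urllib.parse import unquote


def parse_attributes(column9):
    """Split a GFF3 attribute column into a dict, decoding %-escapes."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            attrs[key.strip()] = unquote(value.strip())
    return attrs


def gene_ids(gff_lines, feature="gene", key="ID"):
    """Collect one attribute value per row whose type matches `feature`."""
    ids = []
    for line in gff_lines:
        if line.startswith("#"):
            continue  # skip directives and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) == 9 and cols[2] == feature:
            value = parse_attributes(cols[8]).get(key)
            if value is not None:
                ids.append(value)
    return ids
```

Passing `key="Name"` or another attribute key selects a different identifier when `ID` is not the one the downstream service recognises.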

BioMart on Ensembl appears to have only the Nicotiana attenuata genome, not the one the OP likely wants.

Reply by genomax, 7 months ago.

genomax is right: there are no Nicotiana tabacum data on Ensembl.

Reply by rimgubaev, 7 months ago.