Question: How to extract all promoter regions in multi-fasta format from genome using GFF?
5 months ago, rimgubaev wrote:

Hi Everyone,

How can I extract promoter sequences (ca. 1000 bp upstream of the TSS) in multi-FASTA format from a genome (also a multi-FASTA file, with scaffolds) using the information in the corresponding GFF file? I have already tried the GFF-Ex tool, but it did not help (it finished with errors). The genome is tobacco (ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/715/135/GCF_000715135.1_Ntab-TN90/).

Does anyone know of other tools for this?

Thanks,

Tags: promoter, gff, fasta, genome
10 days ago, rimgubaev wrote:

Finally, I've solved this problem by combining samtools, bedtools, and a custom R script. The pipeline is placed in a bash script here.
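
The linked script is not reproduced in this thread. For illustration only, and not as the pipeline described above, here is a minimal pure-Python sketch of the same idea. It assumes that the TSS can be approximated by the start of each `gene` feature on the + strand and by its end on the - strand, that the GFF sequence IDs match the first word of the FASTA headers, and that the genome fits in memory. The function names (`read_fasta`, `promoters`, `write_fasta`) are hypothetical, chosen for this example.

```python
def read_fasta(handle):
    """Parse a multi-FASTA stream into {sequence_id: sequence}."""
    seqs, name, chunks = {}, None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                seqs[name] = "".join(chunks)
            name = line[1:].split()[0]  # assumption: ID is the first word of the header
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        seqs[name] = "".join(chunks)
    return seqs


_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def revcomp(seq):
    """Reverse-complement a DNA string (other IUPAC codes are left unchanged)."""
    return seq.translate(_COMPLEMENT)[::-1]


def promoters(gff_handle, genome, upstream=1000):
    """Yield (header, sequence) for the region upstream of each 'gene' feature.

    Assumption: gene start (+ strand) or gene end (- strand) stands in for the TSS.
    Regions are clipped at scaffold ends; genes with no upstream sequence are skipped.
    """
    for line in gff_handle:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "gene" or cols[0] not in genome:
            continue
        seqid, strand, attrs = cols[0], cols[6], cols[8]
        start, end = int(cols[3]), int(cols[4])  # GFF is 1-based, inclusive
        seq = genome[seqid]
        if strand == "+":
            lo, hi = max(0, start - 1 - upstream), start - 1  # 0-based, half-open
            prom = seq[lo:hi]
        elif strand == "-":
            lo, hi = end, min(len(seq), end + upstream)
            prom = revcomp(seq[lo:hi])  # report in the gene's orientation
        else:
            continue  # unstranded features have no defined upstream side
        if not prom:
            continue
        fields = dict(f.split("=", 1) for f in attrs.split(";") if "=" in f)
        gene_id = fields.get("ID", f"{seqid}:{start}-{end}")
        yield f"{gene_id}_promoter {seqid}:{lo + 1}-{hi}({strand})", prom


def write_fasta(records, out, width=60):
    """Write (header, sequence) pairs as wrapped multi-FASTA."""
    for header, seq in records:
        out.write(f">{header}\n")
        for i in range(0, len(seq), width):
            out.write(seq[i:i + width] + "\n")
```

To use it, open the genome FASTA and the GFF file, then call `write_fasta(promoters(gff_handle, read_fasta(fasta_handle)), sys.stdout)`. For a genome as large as tobacco, holding every scaffold in memory may be impractical, so an indexed FASTA reader would probably be a better fit than `read_fasta`.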

5 months ago, shoujun.gu (Rockville/MD) wrote:
  1. Extract the gene IDs from the GFF file.
  2. Fetch the promoter sequences from BioMart using the gene IDs you extracted.

BioMart on Ensembl appears to have only the Nicotiana attenuata genome, not the one the OP likely wants.

Reply written 5 months ago by genomax

genomax is right: there are no Nicotiana tabacum data on Ensembl.

Reply written 5 months ago by rimgubaev