Question: How to extract all promoter regions in multi-fasta format from genome using GFF?
1
2.1 years ago by
rimgubaev180
Russia/Moscow/Skoltech
rimgubaev180 wrote:

Hi Everyone,

How can I extract promoter sequences (ca. 1000 bp upstream of the TSS) in multi-FASTA format from a genome (also a multi-FASTA file, with scaffolds) using the information in the corresponding GFF file? I have already tried the GFF-Ex tool, but it did not help (it finished with errors). The genome is tobacco (ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/715/135/GCF_000715135.1_Ntab-TN90/).

Does anyone know some other tools for this?

Thanks,

promoter gff fasta genome • 1.5k views
modified 19 months ago • written 2.1 years ago by rimgubaev180
4
19 months ago by
rimgubaev180
Russia/Moscow/Skoltech
rimgubaev180 wrote:

Finally, I solved this problem by combining samtools, bedtools, and a custom R script. The pipeline, wrapped in a bash script, is available here.
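For readers who only want the general idea, here is a rough pure-Python sketch of the same task. It is not the samtools/bedtools/R pipeline mentioned above, just an illustration. Assumptions: the TSS is approximated by the start of each `gene` feature on the `+` strand and by its end on the `-` strand (a GFF file usually does not annotate the TSS directly), the whole genome fits in memory, and the function names `read_fasta`, `revcomp`, and `promoters` are made up for this example.

```python
def read_fasta(lines):
    """Parse FASTA lines into {sequence_id: sequence}; the id is the first word of the header."""
    seqs, name = {}, None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            seqs[name] = []
        elif name is not None:
            seqs[name].append(line)
    return {k: "".join(v) for k, v in seqs.items()}


_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def revcomp(seq):
    """Reverse-complement a DNA string (only A, C, G, T, N are complemented)."""
    return seq.translate(_COMPLEMENT)[::-1]


def promoters(gff_lines, genome, upstream=1000):
    """Yield (header, sequence) for the window upstream of every 'gene' feature.

    Assumption: TSS = gene start on '+', gene end on '-'.
    Windows are clipped at scaffold ends; empty windows are skipped.
    """
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "gene" or cols[0] not in genome:
            continue
        seqid, start, end, strand = cols[0], int(cols[3]), int(cols[4]), cols[6]
        scaffold = genome[seqid]
        # GFF coordinates are 1-based and inclusive; Python slices are 0-based, half-open.
        if strand == "+":
            lo, hi = max(0, start - 1 - upstream), start - 1
            seq = scaffold[lo:hi]
        elif strand == "-":
            lo, hi = end, min(len(scaffold), end + upstream)
            seq = revcomp(scaffold[lo:hi])  # report in the gene's orientation
        else:
            continue
        if not seq:
            continue
        attrs = dict(f.split("=", 1) for f in cols[8].split(";") if "=" in f)
        gene_id = attrs.get("ID", "unknown")
        # Header coordinates are converted back to 1-based inclusive.
        yield f"{gene_id}::{seqid}:{lo + 1}-{hi}({strand})", seq


def write_promoters(gff_path, fasta_path, out_path, upstream=1000):
    """Read the genome and GFF from disk and write promoters as multi-FASTA."""
    with open(fasta_path) as fa:
        genome = read_fasta(fa)
    with open(gff_path) as gff, open(out_path, "w") as out:
        for header, seq in promoters(gff, genome, upstream):
            out.write(f">{header}\n{seq}\n")
```

Calling `write_promoters("genes.gff", "genome.fna", "promoters.fa")` (file names are placeholders) would write one record per gene. Genes that sit closer than 1000 bp to a scaffold end get a shorter promoter, which matters for a fragmented scaffold-level assembly.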

modified 18 months ago • written 19 months ago by rimgubaev180
0
2.1 years ago by
shoujun.gu370
Rockville/MD
shoujun.gu370 wrote:
  1. Extract the gene IDs from the GFF file.
  2. Fetch the promoter sequences from BioMart using the gene IDs you extracted.
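
Step 1 could look roughly like the sketch below. It assumes GFF3-style attributes (`key=value` pairs separated by `;` in column 9) and takes the `ID` attribute of every `gene` feature; the function name `gene_ids` and the `key` parameter are made up for illustration, and the identifiers a given BioMart instance accepts may be stored under a different attribute.

```python
def gene_ids(gff_lines, key="ID"):
    """Collect the value of one attribute (default 'ID') from every 'gene' feature."""
    ids = []
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "gene":
            continue
        # Column 9 holds 'key=value' pairs separated by ';' in GFF3.
        attrs = dict(f.split("=", 1) for f in cols[8].split(";") if "=" in f)
        if key in attrs:
            ids.append(attrs[key])
    return ids
```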
written 2.1 years ago by shoujun.gu370

BioMart on Ensembl appears to have only the Nicotiana attenuata genome, not the one the OP likely wants.

written 2.1 years ago by genomax78k

genomax is right: there are no Nicotiana tabacum data on Ensembl.

modified 2.1 years ago • written 2.1 years ago by rimgubaev180