Extract the 200 bp upstream of every coding sequence in a bacterial genome
3.7 years ago
gaoyanwang • 0

Hi all,

I am new to computational biology and am learning to use Python for analysis. Could anyone give me some advice on the following question? Thanks a lot!

In bacterial genomes, the 200-300 bp upstream of a coding sequence (CDS) usually contain the regulatory region. A particular bacterial strain has about 5000 annotated CDSs, and I would like to output the regulatory region for each of them. Is there a package or an existing script that does this?

3.7 years ago
bedtools flank
bedtools getfasta


Those are the basic steps, assuming you don't have splicing (if you do, then you'll need to do some filtering).
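Since you mentioned Python, here is a rough sketch of what those two steps compute, in case it helps to see the logic. On the command line it would be something along the lines of `bedtools flank -i cds.bed -g genome.sizes -l 200 -r 0 -s` followed by `bedtools getfasta -fi genome.fa -bed flank.bed -s`, where the file names are placeholders of mine. The function names below are also mine, not part of bedtools or any package. The sketch assumes BED-style 0-based, half-open coordinates and a single linear sequence, so a flank that would run past either end is simply clipped. It does not wrap around the origin of a circular chromosome.

```python
# Hypothetical helper names; this only mimics the strand-aware flank logic.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_flank(start, end, strand, chrom_size, length=200):
    """Return (flank_start, flank_end) for the region upstream of a CDS.

    Coordinates are 0-based, half-open (BED convention). Returns None when
    the CDS sits at the very edge of the sequence and no flank is left.
    """
    if strand == "+":
        # Upstream of a forward-strand gene lies before its start.
        flank_start = max(0, start - length)
        flank_end = start
    elif strand == "-":
        # Upstream of a reverse-strand gene lies after its end.
        flank_start = end
        flank_end = min(chrom_size, end + length)
    else:
        raise ValueError(f"unexpected strand: {strand!r}")
    if flank_start >= flank_end:
        return None
    return flank_start, flank_end


def upstream_sequence(genome, start, end, strand, length=200):
    """Return the upstream sequence, reported 5'->3' relative to the gene."""
    region = upstream_flank(start, end, strand, len(genome), length)
    if region is None:
        return ""
    flank_start, flank_end = region
    seq = genome[flank_start:flank_end]
    # Reverse-strand genes are read from the opposite strand.
    return reverse_complement(seq) if strand == "-" else seq


if __name__ == "__main__":
    toy_genome = "AAAACCCCGGGGTTTT"  # made-up sequence for illustration
    print(upstream_sequence(toy_genome, 8, 12, "+", length=4))
    print(upstream_sequence(toy_genome, 8, 12, "-", length=4))
```

The strand handling is the part that is easy to get wrong: for genes on the minus strand the upstream region is on the right-hand side of the interval, and the extracted sequence has to be reverse-complemented.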

Thank you very much, Devon. I won't need to handle any splicing. Is there a tool to convert my .gb file to BED format? Thanks!

3.6 years ago
gaoyanwang • 0

