extract promoter sequence from whole genome
3.8 years ago
citronxu ▴ 20

Hi there,

I'm currently working with Brassica napus and would like to extract the promoter sequences of certain genes on a Linux platform.

First I looked through the Brassica napus databases (Ensembl and Genoscope) and found information on each predicted gene (cDNA and polypeptide sequences), but nothing on promoter regions.

Then I downloaded the whole genome (Brassica napus_v4.1_chromosomes_fa.gz from Genoscope) and intend to retrieve the promoter sequences from it. What I have is the position of each gene (for example, which chromosome it is on and which positions it spans) and the whole genome sequence. What commands can I use to jump to a gene's position and display its flanking sequence, so that I can copy the 1000 bp upstream of the start codon?

What I tried was reading the file with 'zless' and searching it with 'grep' plus details about the genes (for instance, the chromosome number), but it did not work.

Any recommendations and suggestions are welcome.

Many thanks in advance!


Extracting promoters is an inexact science.

Broadly speaking, if you want the ~1000 bp upstream of every gene feature, you can do this quite easily with Biopython or similar. It would be easier to do this from a GenBank file or similar, where you don't need to know the feature coordinates a priori.

If you do have this information, it can still be done, but it requires a more direct approach. Can you please show what your input data (the coordinates file) actually looks like?
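As a rough illustration of the "coordinates already known" case, here is a minimal sketch using only the Python standard library rather than Biopython. It assumes 1-based, inclusive gene coordinates (as in GFF files) and a known strand for each gene; the function names, the file name `genome.fa.gz`, and the chromosome name `chrA01` are all made up for the example. Note that it measures upstream from the annotated gene boundary, which may be the transcription start rather than the start codon, depending on the annotation.

```python
import gzip

def read_fasta(path):
    """Load a plain or gzipped FASTA file into a {sequence_id: sequence} dict."""
    opener = gzip.open if str(path).endswith(".gz") else open
    seqs, name, chunks = {}, None, []
    with opener(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if name is not None:
                    seqs[name] = "".join(chunks)
                # Use the first word of the header as the sequence ID.
                name, chunks = line[1:].split()[0], []
            elif line:
                chunks.append(line)
    if name is not None:
        seqs[name] = "".join(chunks)
    return seqs

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def upstream(chrom_seq, start, end, strand, length=1000):
    """Return up to `length` bases upstream of a gene.

    `start` and `end` are 1-based inclusive coordinates on the forward strand.
    For '-' strand genes the upstream region lies after `end` and is returned
    reverse-complemented, so the result always reads 5'->3' towards the gene.
    The region is truncated at chromosome ends.
    """
    if strand == "+":
        lo = max(0, start - 1 - length)
        return chrom_seq[lo:start - 1]
    if strand == "-":
        hi = min(len(chrom_seq), end + length)
        return reverse_complement(chrom_seq[end:hi])
    raise ValueError("strand must be '+' or '-'")

# Hypothetical usage (file and chromosome names are assumptions):
# genome = read_fasta("genome.fa.gz")
# promoter = upstream(genome["chrA01"], 12345, 14000, "+")
```

Loading the whole genome into memory is the simplest option but can be heavy for a large plant genome; an indexed FASTA reader would be the usual alternative.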


Hi, thank you for the reply. I feel so sorry for my stupid question :/ After I saw your comment I checked the NCBI database and found the exact promoter sequences recorded there...
