Extracting the exon sequence of a genomic region
9.8 years ago
firoz.imtech ▴ 50

Dear All,

I am new to BioPerl. I ran e-PCR version 2.3.11 on a genomic sequence and got an output file, "seq1.epcr", with the following command:

./e-PCR -w9 -f 1 -m5000 test.sts  WS240.genomic.fa  T=3 >seq1.epcr

"seq1.epcr" has eight tab-separated columns and looks like this:

Chr   STS_name   strand    start    end     length/5000-5000    gap  mismatch
I       FOR_F32H2.2     +       8966315 8966961 647/5000-5000   0       0
I       FOR_Y54E10BR.d  -       3028477 3031091 2615/5000-5000  0       0
III     FOR_B0280.1.v5  +       7133931 7135112 1182/5000-5000  0       0
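
For illustration, one line of this output could be read as in the minimal sketch below. It is plain Python, not BioPerl. The record type `EpcrHit` and the function `parse_epcr_line` are hypothetical names, and the column order (gaps before mismatches) simply follows the header row shown above rather than the e-PCR documentation. The sketch splits on any whitespace, so it accepts both tabs and spaces.

```python
from typing import NamedTuple


class EpcrHit(NamedTuple):
    """One e-PCR hit; field names are illustrative, not an official schema."""
    chrom: str
    sts_name: str
    strand: str
    start: int       # 1-based, inclusive
    end: int         # 1-based, inclusive
    length: int      # observed amplicon length (the number before the "/")
    gaps: int
    mismatches: int


def parse_epcr_line(line: str) -> EpcrHit:
    """Parse one whitespace-separated hit line into an EpcrHit."""
    fields = line.split()
    if len(fields) != 8:
        raise ValueError(f"expected 8 columns, got {len(fields)}: {line!r}")
    chrom, sts_name, strand, start, end, size, gaps, mismatches = fields
    # The size column looks like "647/5000-5000"; keep the observed length only.
    observed = int(size.split("/", 1)[0])
    return EpcrHit(chrom, sts_name, strand, int(start), int(end),
                   observed, int(gaps), int(mismatches))


hit = parse_epcr_line("I\tFOR_F32H2.2\t+\t8966315\t8966961\t647/5000-5000\t0\t0")
print(hit.chrom, hit.start, hit.end, hit.strand)
```

In the sample rows, `end - start + 1` equals the reported length, which suggests the coordinates are 1-based and inclusive.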

Now I want to extract the amplicon sequences in FASTA format from "WS240.genomic.fa" according to the STS hits in "seq1.epcr".

However, the amplicon sequences should contain:

  1. Only exon sequence
  2. If a primer hits a non-exon (intron) region, take only the exon sequence and report whether the forward or reverse primer hit the intron.
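
The two requirements above amount to clipping each amplicon to the exons it overlaps and checking whether its ends fall inside an exon. The following is a rough Python sketch under several assumptions, not a BioPerl solution. All function names are hypothetical. Exons are given as 1-based inclusive `(start, end)` pairs for the same chromosome, for example taken from the `exon` rows of the GFF3 file. A primer is approximated by the terminal base of the amplicon, because the e-PCR output does not carry primer lengths. On the `-` strand the forward primer is assumed to sit at the higher coordinate.

```python
def clip_to_exons(start, end, exons):
    """Return the parts of [start, end] that lie inside exons (1-based, inclusive)."""
    pieces = []
    for ex_start, ex_end in sorted(exons):
        lo, hi = max(start, ex_start), min(end, ex_end)
        if lo <= hi:
            pieces.append((lo, hi))
    return pieces


def primers_outside_exons(start, end, strand, exons):
    """Name the primer(s) whose amplicon end is not covered by any exon."""
    def in_exon(pos):
        return any(s <= pos <= e for s, e in exons)

    # Assumption: on "+" the forward primer is at the left end, on "-" at the right end.
    left, right = ("forward", "reverse") if strand == "+" else ("reverse", "forward")
    flagged = []
    if not in_exon(start):
        flagged.append(left)
    if not in_exon(end):
        flagged.append(right)
    return flagged


_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def exon_sequence(chrom_seq, pieces, strand):
    """Join the exon pieces; reverse-complement the result for the "-" strand."""
    seq = "".join(chrom_seq[lo - 1:hi] for lo, hi in pieces)
    return seq.translate(_COMPLEMENT)[::-1] if strand == "-" else seq


# Made-up coordinates, for illustration only.
exons = [(50, 120), (150, 180), (190, 260)]
print(clip_to_exons(130, 200, exons))
print(primers_outside_exons(130, 200, "+", exons))
```

A more careful version would test the full primer span, using the primer sequences in the STS file, rather than a single base.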

Could you please tell me how I can use a GFF3 annotation file with Bio::Tools::EPCR to extract my amplicon sequences, or suggest any other method to do the same?

Note: I have also loaded the GFF3 annotation and the genomic sequence into a MySQL database using bp_seqfeature_load.pl.

Thanks
Firoz

genome sequence • 2.2k views

What about trying existing tools?

Try gff2fasta
