Extract amino acid sequences (with FASTA header) from FGENESH output: Python
7.3 years ago
nut_B ▴ 10

Hello,

Could anyone help me extract the amino acid sequences (with their FASTA headers) from the output of the FGENESH prediction tool?

The FGENESH output looks like this:

*FGENESH 4.0.0 Prediction of potential genes in Fusarium/Pezizomycotina genomic DNA
 Time    :   Tue Oct 11 18:40:21 2016
 Seq name: unitig_0|quiver|quiver 
 Length of sequence: 6684005 
 Number of predicted genes 1593: in +chain 768, in -chain 825.
 Number of predicted exons 5989: in +chain 2949, in -chain 3040.
 Positions of predicted genes and exons: Variant   1 from   1, Score:106181.523438 
   G Str   Feature   Start        End    Score           ORF           Len
   1 -      PolA      7273                3.25
   1 -    1 CDSl      7320 -      7345    2.65      7320 -      7343     24
   1 -    2 CDSi      7381 -      7399    2.20      7382 -      7399     18
   1 -    3 CDSi      7478 -      7493    8.21      7478 -      7492     15
   1 -    4 CDSf      7627 -      7655   -7.12      7629 -      7655     27
   1 -      TSS       8189                0.28
   2 +      TSS      15941               -2.21
   2 +    1 CDSf     16356 -     16371   -2.72     16356 -     16370     15
   2 +    2 CDSi     16722 -     16727    2.95     16724 -     16726      3
   2 +    3 CDSi     16786 -     16796    6.44     16788 -     16796      9
   2 +    4 CDSi     17219 -     17227    5.40     17219 -     17227      9
   2 +    5 CDSl     17495 -     17500    8.90     17495 -     17500      6
   2 +      PolA     17534                3.25
Predicted protein(s):
>FGENESH:[mRNA]   1   4 exon (s)   7320  -   7655    90 bp, chain -
atggcagggtggctaacgggaagtgttaggatagagttaacgttgaaaagagcaagctat
aattttagcgcgcaggtattgtacaagtaa
>FGENESH:   1   4 exon (s)   7320  -   7655    29 aa, chain -
MAGWLTGSVRIELTLKRASYNFSAQVLYK
>FGENESH:[mRNA]   2   5 exon (s)  16356  -  17500    48 bp, chain +
atgaataagcgtaaaatgaaaggcaaaaatattctaaaaacggcataa
>FGENESH:   2   5 exon (s)  16356  -  17500    15 aa, chain +
MNKRKMKGKNILKTA*

I would like to extract only the amino acid record for a given position. For example, I would like to get only the protein predicted at position 16356-17500, with its header and amino acid sequence, like this:

>FGENESH:   2   5 exon (s)  16356  -  17500    15 aa, chain +
MNKRKMKGKNILKTA

Could anyone suggest how to do this with a Python script?

Thank you in advance,

python FGENESH aminoacid • 2.0k views
7.3 years ago

Use grep to match the lines starting with '>' and print one line after each match:

   grep '^>' -A 1 --no-group-separator input.txt
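
Since the question asks for Python, here is a minimal sketch of the same idea. It assumes the protein headers always follow the pattern shown in the question (`>FGENESH:` followed by gene number, exon count, start, end and `aa`), that mRNA records are marked with `[mRNA]`, and that only sequence lines follow a protein header. The function name `extract_proteins` and the `SAMPLE` text are illustrative, not part of FGENESH. Unlike the grep command, it skips the mRNA records, joins sequences that wrap over several lines, and drops the trailing `*`.

```python
import re

# Matches protein headers only; mRNA headers have "[mRNA]" right after the colon.
# Captures the start and end coordinates (pattern assumed from the question's sample).
PROTEIN_HEADER = re.compile(
    r"^>FGENESH:\s+\d+\s+\d+\s+exon\s*\(s\)\s+(\d+)\s+-\s+(\d+)\s+\d+\s+aa"
)


def extract_proteins(lines, start=None, end=None):
    """Return (header, sequence) tuples for protein records.

    If start/end are given, keep only records with exactly those coordinates.
    """
    records = []
    header, seq = None, []
    for raw in lines:
        line = raw.strip()
        if line.startswith(">"):
            # A new record begins: store the protein collected so far, if any.
            if header is not None:
                records.append((header, "".join(seq).rstrip("*")))
            header, seq = None, []
            match = PROTEIN_HEADER.match(line)
            if match:
                s, e = int(match.group(1)), int(match.group(2))
                if (start is None or s == start) and (end is None or e == end):
                    header = line
        elif header is not None and line:
            # Sequence line belonging to the protein currently being collected.
            seq.append(line)
    if header is not None:
        records.append((header, "".join(seq).rstrip("*")))
    return records


# Illustrative input copied from the question.
SAMPLE = """Predicted protein(s):
>FGENESH:[mRNA]   1   4 exon (s)   7320  -   7655    90 bp, chain -
atggcagggtggctaacgggaagtgttaggatagagttaacgttgaaaagagcaagctat
aattttagcgcgcaggtattgtacaagtaa
>FGENESH:   1   4 exon (s)   7320  -   7655    29 aa, chain -
MAGWLTGSVRIELTLKRASYNFSAQVLYK
>FGENESH:[mRNA]   2   5 exon (s)  16356  -  17500    48 bp, chain +
atgaataagcgtaaaatgaaaggcaaaaatattctaaaaacggcataa
>FGENESH:   2   5 exon (s)  16356  -  17500    15 aa, chain +
MNKRKMKGKNILKTA*
"""

for header, sequence in extract_proteins(SAMPLE.splitlines(), 16356, 17500):
    print(header)
    print(sequence)
```

To run it on a real file, pass an open file handle instead of `SAMPLE.splitlines()`, for example `extract_proteins(open("input.txt"))`. Calling it without `start` and `end` returns every protein record.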

Thank you very much :)


Please flag this as answered (green flag on the left).


Sorry, could you please explain how to mark your answer as accepted? I do not know how to do that.
