Extract amino acid sequences (with FASTA headers) from FGENESH output: Python
1
0
4.8 years ago
nut_B ▴ 10

Hello,

Could anyone help me extract the amino acid sequences (with their FASTA headers) from the output of the FGENESH prediction tool?

The FGENESH output looks like this:

*FGENESH 4.0.0 Prediction of potential genes in Fusarium/Pezizomycotina genomic DNA
Time    :   Tue Oct 11 18:40:21 2016
Seq name: unitig_0|quiver|quiver
Length of sequence: 6684005
Number of predicted genes 1593: in +chain 768, in -chain 825.
Number of predicted exons 5989: in +chain 2949, in -chain 3040.
Positions of predicted genes and exons: Variant   1 from   1, Score:106181.523438
G Str   Feature   Start        End    Score           ORF           Len
1 -      PolA      7273                3.25
1 -    1 CDSl      7320 -      7345    2.65      7320 -      7343     24
1 -    2 CDSi      7381 -      7399    2.20      7382 -      7399     18
1 -    3 CDSi      7478 -      7493    8.21      7478 -      7492     15
1 -    4 CDSf      7627 -      7655   -7.12      7629 -      7655     27
1 -      TSS       8189                0.28
2 +      TSS      15941               -2.21
2 +    1 CDSf     16356 -     16371   -2.72     16356 -     16370     15
2 +    2 CDSi     16722 -     16727    2.95     16724 -     16726      3
2 +    3 CDSi     16786 -     16796    6.44     16788 -     16796      9
2 +    4 CDSi     17219 -     17227    5.40     17219 -     17227      9
2 +    5 CDSl     17495 -     17500    8.90     17495 -     17500      6
2 +      PolA     17534                3.25
Predicted protein(s):
>FGENESH:[mRNA]   1   4 exon (s)   7320  -   7655    90 bp, chain -
atggcagggtggctaacgggaagtgttaggatagagttaacgttgaaaagagcaagctat
aattttagcgcgcaggtattgtacaagtaa
>FGENESH:   1   4 exon (s)   7320  -   7655    29 aa, chain -
MAGWLTGSVRIELTLKRASYNFSAQVLYK
>FGENESH:[mRNA]   2   5 exon (s)  16356  -  17500    48 bp, chain +
atgaataagcgtaaaatgaaaggcaaaaatattctaaaaacggcataa
>FGENESH:   2   5 exon (s)  16356  -  17500    15 aa, chain +
MNKRKMKGKNILKTA*


But I would like to extract only the amino acid records, and only for a given position. For example, for the gene at position 16356-17500 I would like to get the FASTA header and the amino acid sequence, like this:

>FGENESH:   2   5 exon (s)  16356  -  17500    15 aa, chain +
MNKRKMKGKNILKTA


Could anyone suggest a Python script for this?

1
4.8 years ago

grep the lines starting with '>' and print one line after each match:

   grep '^>' -A 1 --no-group-separator input.txt
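
Since the question asks for Python and for a single position, here is a minimal sketch rather than a definitive parser. It assumes the protein headers look exactly like the ones in the sample output (`>FGENESH:` followed by spaces, with no `[mRNA]` tag, and ending in `aa, chain +/-`). The function name `extract_proteins` and its `start`/`end` parameters are made up for this illustration. Unlike the grep one-liner, it skips the `[mRNA]` records, joins sequences that wrap over several lines, and drops the trailing `*`.

```python
import re

# Protein headers in the sample look like:
# >FGENESH:   2   5 exon (s)  16356  -  17500    15 aa, chain +
# The [mRNA] headers have "[mRNA]" right after the colon, so they do not match.
HEADER_RE = re.compile(
    r"^>FGENESH:\s+\d+\s+\d+\s+exon\s*\(s\)\s+(\d+)\s+-\s+(\d+)\s+\d+\s+aa"
)
# Sequence lines are assumed to contain only letters and an optional '*'.
SEQ_RE = re.compile(r"^[A-Za-z*]+$")


def extract_proteins(lines, start=None, end=None):
    """Return a list of (header, sequence) tuples for protein records.

    If start and/or end are given, only records whose header positions
    match them exactly are returned.
    """
    records = []
    header, parts, keep = None, [], False

    def flush():
        # Store the record collected so far, without the trailing stop '*'.
        if keep and header is not None:
            records.append((header, "".join(parts).rstrip("*")))

    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(">"):
            flush()
            header, parts = line, []
            match = HEADER_RE.match(line)
            keep = (
                match is not None
                and (start is None or int(match.group(1)) == start)
                and (end is None or int(match.group(2)) == end)
            )
        elif keep and SEQ_RE.match(line.strip()):
            parts.append(line.strip())
    flush()
    return records
```

To use it on a file, open the FGENESH output and pass the file object, for example `extract_proteins(open("input.txt"), start=16356, end=17500)`, then print each header followed by its sequence. Leaving out `start` and `end` returns every protein record in the file.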

0
Thank you very much :)

