Off topic: Miranda Output Parsing
14.4 years ago
Patrick ▴ 220

Hello everyone,

I am using miRanda and would like to write a Perl parser to extract the regions marked in bold in the output below. I hope you can guide me through writing it.

Here is the output. I only want to extract the bold-tagged parts, which indicate a match between my miRNA and a cDNA. (The output is truncated. The text between ** markers is supposed to be bold; bold formatting does not work inside code tags here.)

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Performing Scan: 5102 vs ENST00000403423
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Score for this Scan:
No Hits Found above Threshold
Complete

Read Sequence:ENST00000356294 cdna:pseudogene chromosome:GRCh37:1:248246965:248247898:-1 gene:ENSG00000197067(930 nt)
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Performing Scan: 5102 vs ENST00000356294
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Score for this Scan:
No Hits Found above Threshold
Complete

Read Sequence:ENST00000431838 cdna:pseudogene chromosome:GRCh37:1:247996651:247997590:-1 gene:ENSG00000230576(940 nt)
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Performing Scan: **5102** vs **ENST00000431838**
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

   Forward:    Score: 140.000000  Q:2 to 25  R:**393 to 418** Align Len (23) (56.52%) (69.57%)

   Query:    3' caGGCACGAACGGAUCGCUUGAGUGg 5'
                  || |||   : |: | |:||||| 
   Ref:      5' taCCCTGCAATTATGACCAGCTCACt 3'

   Energy:  **-16.430000 kCal/Mol**

Scores for this hit:
>5102    ENST00000431838    140.00    -16.43    2 25    393 418    23    56.52%    69.57%

Score for this Scan:
Seq1,Seq2,Tot Score,Tot Energy,Max Score,Max Energy,Strand,Len1,Len2,Positions
>>5102    ENST00000431838    140.00    -16.43    140.00    -16.43    28    26    940     393
Complete

Read Sequence:ENST00000438288 cdna:pseudogene chromosome:GRCh37:1:247830189:247831143:1 gene:ENSG00000230411(955 nt)
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Performing Scan: 5102 vs ENST00000438288
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Score for this Scan:
No Hits Found above Threshold
Complete

Read Sequence:ENST00000438881 cdna:known chromosome:GRCh37:17:4835625:4838274:1 gene:ENSG00000185245(2339 nt)
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Performing Scan: **5102** vs **ENST00000438881**
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

   Forward:    Score: 147.000000  Q:2 to 25  R:**615 to 641** Align Len (24) (66.67%) (70.83%)

   Query:    3' caGGCACGA-ACGGAUCGCUUGAGUGg 5'
                  || | ||   || || ||||||:| 
   Ref:      5' caCCCTTCTCCTCCAAGAGAACTCGCt 3'

   Energy:  **-20.080000 kCal/Mol**

Scores for this hit:
>5102    ENST00000438881    147.00    -20.08    2 25    615 641    24    66.67%    70.83%


   Forward:    Score: 143.000000  Q:4 to 25  R:444 to 471 Align Len (23) (73.91%) (82.61%)

   Query:    3' caGGCACGAAC--GGAUCGCUUGAGugg 5'
                  || ||| ||  |:|:||||||||   
   Ref:      5' tgCCCTGCGTGGTCTTGGCGAACTCcaa 3'

   Energy:  -28.320000 kCal/Mol

Scores for this hit:
>5102    ENST00000438881    143.00    -28.32    4 25    444 471    23    73.91%    82.61%

Score for this Scan:
Seq1,Seq2,Tot Score,Tot Energy,Max Score,Max Energy,Strand,Len1,Len2,Positions
>>5102    ENST00000438881    290.00    -48.40    147.00    -28.32    30    26    2339     615 444
Complete

Read Sequence:ENST00000329125 cdna:known chromosome:GRCh37:17:4835592:4838325:1 gene:ENSG00000185245(2501 nt)

Thank you

Pat

perl parsing • 4.9k views
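
One possible approach, sketched in Python rather than Perl (the logic is a plain whitespace split, so it should port to Perl's `split` directly): every bolded value (miRNA ID, transcript ID, reference positions, energy) also appears on the per-hit summary line that starts with a single `>`, so it may be enough to read only those lines and skip the `>>` per-scan totals. The function name `parse_miranda_hits` and the dictionary keys are hypothetical, and the field order is assumed from the sample output in the post, not from miRanda documentation. The `**` markers are treated as forum formatting that is absent from the real file.

```python
from typing import Dict, Iterable, Iterator


def parse_miranda_hits(lines: Iterable[str]) -> Iterator[Dict[str, object]]:
    """Yield one dict per per-hit summary line (">" but not ">>")."""
    for line in lines:
        line = line.strip()
        # ">>" lines are per-scan totals; plain ">" lines describe a single hit.
        if not line.startswith(">") or line.startswith(">>"):
            continue
        fields = line[1:].split()
        # Assumed layout (from the sample): miRNA, target, score, energy,
        # query start/end, reference start/end, alignment length, two percentages.
        if len(fields) != 11:
            continue  # unexpected layout: skip rather than guess
        yield {
            "mirna": fields[0],
            "target": fields[1],
            "score": float(fields[2]),
            "energy": float(fields[3]),
            "ref_start": int(fields[6]),
            "ref_end": int(fields[7]),
        }


# A shortened excerpt of the output shown in the post.
sample = """\
Performing Scan: 5102 vs ENST00000431838
   Forward:    Score: 140.000000  Q:2 to 25  R:393 to 418 Align Len (23) (56.52%) (69.57%)
   Energy:  -16.430000 kCal/Mol
Scores for this hit:
>5102    ENST00000431838    140.00    -16.43    2 25    393 418    23    56.52%    69.57%
Score for this Scan:
>>5102    ENST00000431838    140.00    -16.43    140.00    -16.43    28    26    940     393
Complete
"""

hits = list(parse_miranda_hits(sample.splitlines()))
for hit in hits:
    print(hit["mirna"], hit["target"], hit["ref_start"], hit["ref_end"],
          hit["energy"], sep="\t")
```

For a real run, the lines would come from an open file handle instead of `sample.splitlines()`. Scans that report "No Hits Found above Threshold" produce no `>` line, so they are skipped without any special handling. Note that the summary line rounds the energy to two decimals; if the full value from the `Energy:` line is needed, that line would have to be parsed separately.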