Question: Is there a way to get RNAfold output for a multi-FASTA file in tabular format?
26 days ago, adhirajnath14 wrote:

I am trying to calculate the MFE and the secondary structure for each sequence in a multi-FASTA file using RNAfold. The output has the following format:

>abc
GGCGGAGGUAGGGAGGCACGCGAUGGUAUUUCAGAGCCUCCCGAAUACAACUCCAGGGUAGGGUGUUGAAAGCGUUGGAGAUGUCUAAAGACACCGCCAG
(((((.....(((((((.(..((.......)).).)))))))........((((((.((............)).)))))).((((....))))))))).. (-35.80)
>lmn
GGGAGGCACGCGAUGGUAUUUCAGAGCCUCCCGAAUACAACUCCAGGGUAGGGUGUUGAAAGCGUUGGAGAUGUCUAAAGACACCGCCAGUACCACCCCA
(((((((.(..((.......)).).))))))).............((((..((((.........((((.(.((((....)))).).)))))))))))).. (-29.30)
>xyz
CGAUGGUAUUUCAGAGCCUCCCGAAUACAACUCCAGGGUAGGGUGUUGAAAGCGUUGGAGAUGUCUAAAGACACCGCCAGUACCACCCCACCCCGGGACA
....(((........)))(((((............((((.(((((.((......((((.(.((((....)))).).)))))).))))).))))))))).. (-28.40)

Is there a way to get the output in tabular format, with (1) identifier, (2) sequence, (3) secondary structure, and (4) MFE as columns? I have written regular-expression scripts that capture each of the four fields and paste them into a file, but I don't think that is an efficient way of doing it. Is there a more convenient way?


Regular expression capture groups are absolutely a valid way to do it, and probably the least hacky.

Otherwise, you would need to transliterate the \n characters to tabs, but because the records are wrapped across several lines, that would be much harder.
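
As a rough illustration of the capture-group approach, here is a minimal sketch that matches a whole record at once and pulls out the four fields. It assumes every record is exactly three lines (header, unwrapped sequence, structure followed by the energy in parentheses), as in the question. The sub name rnafold_to_rows and the sample record are made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turn RNAfold text output into tab-separated rows.
# Assumes each record is exactly three lines:
#   ">id", sequence, "structure (energy)".
sub rnafold_to_rows {
    my ($text) = @_;
    my @rows;
    # Four capture groups: id, sequence, dot-bracket structure, MFE.
    # \(\s* tolerates padding inside the parentheses, e.g. "( -1.20)".
    while ($text =~ /^>(\S+).*\n([A-Za-z]+)\n([().]+)\s+\(\s*(-?\d+\.\d+)\)\s*$/mg) {
        push @rows, join("\t", $1, $2, $3, $4);
    }
    return @rows;
}

# Small made-up record, for illustration only.
my $sample = ">demo\nGGGAAACCC\n(((...))) ( -1.20)\n";
print "$_\n" for rnafold_to_rows($sample);
```

Only the first word of the header is kept as the identifier. If the sequences were wrapped over several lines, the pattern would need adjusting.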

Reply written 26 days ago by Joe
25 days ago, JC wrote:

In Perl:

#!/usr/bin/perl

use strict;
use warnings;

while (<>) {
    chomp;
    if (/^>/) {
        s/^>//;
        print "$_\t"; # just the seq id
    }
    elsif (/^(\S+)\s+\(\s*(-?\d+\.\d+)\)$/) {
        # structure line; \s* allows a padded energy such as "( -5.80)"
        print "$1\t$2\n"; # fold + MFE
    }
    else {
        print "$_\t"; # the seq
    }
}

Validation:

$ perl fold2tab.pl < fold.fa
abc     GGCGGAGGUAGGGAGGCACGCGAUGGUAUUUCAGAGCCUCCCGAAUACAACUCCAGGGUAGGGUGUUGAAAGCGUUGGAGAUGUCUAAAGACACCGCCAG    (((((.....(((((((.(..((.......)).).)))))))........((((((.((............)).)))))).((((....)))))))))..     -35.80
lmn     GGGAGGCACGCGAUGGUAUUUCAGAGCCUCCCGAAUACAACUCCAGGGUAGGGUGUUGAAAGCGUUGGAGAUGUCUAAAGACACCGCCAGUACCACCCCA    (((((((.(..((.......)).).))))))).............((((..((((.........((((.(.((((....)))).).))))))))))))..     -29.30
xyz     CGAUGGUAUUUCAGAGCCUCCCGAAUACAACUCCAGGGUAGGGUGUUGAAAGCGUUGGAGAUGUCUAAAGACACCGCCAGUACCACCCCACCCCGGGACA    ....(((........)))(((((............((((.(((((.((......((((.(.((((....)))).).)))))).))))).)))))))))..     -28.40

One-liner:

$ perl -pe 's/\n/\t/; s/^>//; s/\s+\(\s*(-?[\d.]+)\)\t$/\t$1\n/' < fold.fa
abc     GGCGGAGGUAGGGAGGCACGCGAUGGUAUUUCAGAGCCUCCCGAAUACAACUCCAGGGUAGGGUGUUGAAAGCGUUGGAGAUGUCUAAAGACACCGCCAG    (((((.....(((((((.(..((.......)).).)))))))........((((((.((............)).)))))).((((....)))))))))..    -35.80
lmn     GGGAGGCACGCGAUGGUAUUUCAGAGCCUCCCGAAUACAACUCCAGGGUAGGGUGUUGAAAGCGUUGGAGAUGUCUAAAGACACCGCCAGUACCACCCCA    (((((((.(..((.......)).).))))))).............((((..((((.........((((.(.((((....)))).).))))))))))))..    -29.30
xyz     CGAUGGUAUUUCAGAGCCUCCCGAAUACAACUCCAGGGUAGGGUGUUGAAAGCGUUGGAGAUGUCUAAAGACACCGCCAGUACCACCCCACCCCGGGACA    ....(((........)))(((((............((((.(((((.((......((((.(.((((....)))).).)))))).))))).)))))))))..    -28.40
Reply written 25 days ago by JC