How to convert a STK alignment into CLUSTAL?
17 months ago
diego1530 ▴ 60

Dear Bioinformatics,

I'm stuck on the following issue. My goal is to predict RNA secondary structures in Hepatitis virus. I want to use RNALalifold from the ViennaRNA package, with default parameters, to determine the boundaries of locally stable structures within each MSA, and then realign these local regions using mLocARNA. The significance and conservation of the structures found will then be evaluated with RNAz. With that in mind, the RNALalifold output is an alignment in Stockholm format like this:

# STOCKHOLM 1.0
#=GF ID aln_142_184
#=GF SS RNALalifold prediction
HEPATV_1     AACGTGAGTTTAGTAAAACCAACGGTTTACGTCTACTCGCGTG
HEPATV_2     AACGTGAGTTTAGTAAAACCAACGGTTTACGTCTACTCGCGTG
HEPATV_3     AACGTGAGTTTAGTAAAACCAACGGTTTACGTCTACTCGCGTG
#=GC RF       AACGUGAGUUUAGUAAAACCAACGGUUUACGUCUACUCGCGUG
#=GC SS_cons  .((((((((...(((((.((...)))))))....)))))))).
//


Note that the line at the bottom (SS_cons) marks the boundaries of the locally stable RNA structure.
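For readers unfamiliar with the notation: the SS_cons line above is a dot-bracket string, where each matching pair of parentheses marks two alignment columns that are predicted to pair. A minimal Python sketch that recovers those column pairs is shown here; the function name `base_pairs` is hypothetical, and it assumes plain `(`, `)` and `.` symbols only, with no pseudoknot brackets.

```python
def base_pairs(dot_bracket):
    """Return 1-based (i, j) column pairs encoded by a dot-bracket string."""
    stack = []   # columns of '(' that are still waiting for a partner
    pairs = []
    for col, symbol in enumerate(dot_bracket, start=1):
        if symbol == "(":
            stack.append(col)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at column {col}")
            pairs.append((stack.pop(), col))
        # any other symbol (e.g. '.') is treated as unpaired
    if stack:
        raise ValueError(f"unmatched '(' at column {stack[-1]}")
    return sorted(pairs)


# SS_cons line copied from the Stockholm example above
ss_cons = ".((((((((...(((((.((...)))))))....))))))))."
for i, j in base_pairs(ss_cons):
    print(i, j)
```

The first and last paired columns give the outer boundaries of the local structure within the alignment block.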

Now I need to convert this alignment into CLUSTAL format, i.e. a file with the "aln" extension, which would serve as input to the mLocARNA tool. This software only supports alignments in CLUSTAL format (aln extension), with the corresponding RNA structure boundaries at the bottom. Does anyone know of a tool, script or method to convert this alignment from STK to CLUSTAL while keeping its RNA structure boundaries?

I would greatly appreciate any help you can give me.

Boundaries structure RNA • 771 views

Hi

Maybe someone who knows exactly what you are talking about could help you with a tool, script or method.

That being said, just in case no one else responds, could you please provide an example of what your input is and what your output should look like? Perhaps you know of a tutorial with the inputs and outputs that you could paste here.

I am asking this because it may be possible to just do this with terminal commands such as awk, grep, cut ... I could give it a try?

-Pratik


Hi.

The output of the RNALalifold program is as shown in the example above, and the input I need for mLocARNA looks like this:

fruA               --CCUCGAGGGGAACCCGAA-------------AGGGACCCGAGAGG--
vhuU               AGCUCACAACCGAACCCAUU-------------UGGGAGGUUGUGAGCU
fdhA               CGCCACCCUGCGAACCCAAUAUAAAAUAAUACAAGGGAGCAG-GUGGCG
#A1                ..*...........CCC.............................5..
#S                 ((((((.((((...(((.................))).)))).))))))


Hmm... I still don't understand the conversion completely... Maybe someone else could chime in?

17 months ago
Mensur Dlakic ★ 21k

To the best of my knowledge, locarna will take the Stockholm file without any modification. If you don't have a version that does that, it would be worth installing the latest.

There is a program called esl-reformat, in the Easel library and also in the HMMER package, that converts between various sequence formats, including Stockholm to Clustal. It will not retain the secondary structure assignment from the very last line, but locarna will work without it. Alternatively, you can always add the last line by hand after converting to Clustal.
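Adding the structure line by hand can also be scripted. The following is a minimal Python sketch, not a tested mLocARNA workflow. The function name `stockholm_to_clustal` is hypothetical. It assumes that the structure should be written as a `#S` line in a single unwrapped block, as in the example mLocARNA input posted in this thread, that a header line starting with `CLUSTAL` is sufficient, and that SS_cons is already plain dot-bracket. Only the first alignment in the file is converted.

```python
def stockholm_to_clustal(stk_text):
    """Convert the first Stockholm alignment in stk_text to CLUSTAL-style text.

    The '#=GC SS_cons' annotation is kept and written as a final '#S' line
    (an assumption based on the target layout shown in this thread).
    """
    names = []     # sequence names in first-seen order
    seqs = {}      # name -> aligned sequence, concatenated across blocks
    ss_cons = ""
    for raw in stk_text.splitlines():
        line = raw.strip()
        if line == "//":          # end of the first alignment
            break
        if not line:
            continue
        fields = line.split()
        if line.startswith("#"):
            # Keep only the consensus structure; skip header, #=GF, #=GC RF, etc.
            if len(fields) == 3 and fields[0] == "#=GC" and fields[1] == "SS_cons":
                ss_cons += fields[2]
            continue
        if len(fields) != 2:
            continue              # not a "name sequence" row
        name, chunk = fields
        if name not in seqs:
            names.append(name)
            seqs[name] = ""
        seqs[name] += chunk

    if not names:
        raise ValueError("no sequences found in Stockholm input")
    length = len(seqs[names[0]])
    if any(len(seqs[n]) != length for n in names):
        raise ValueError("aligned sequences differ in length")
    if ss_cons and len(ss_cons) != length:
        raise ValueError("SS_cons length does not match alignment width")

    # Pad names so that every row, including '#S', starts in the same column.
    pad = max(len(n) for n in names + ["#S"]) + 2
    out = ["CLUSTAL W", ""]
    out += [f"{n:<{pad}}{seqs[n]}" for n in names]
    if ss_cons:
        out.append(f"{'#S':<{pad}}{ss_cons}")
    return "\n".join(out) + "\n"
```

To use it, read the RNALalifold output into a string, pass it to the function, and write the returned text to a file with the aln extension. The sequences are copied unchanged, so T is not converted to U.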


Hi Mensur,

I would like to clarify that mLocARNA only supports the FASTA and CLUSTAL formats. It's a pity that it does not accept STK. I appreciate your contribution, but for my analysis it is essential to include the secondary structure assignment in the last line, and so far I haven't seen a way to do that.