How to concatenate two gene MSA files for the same organisms to build a single concatenated phylogeny of both genes?
2.1 years ago
Ap1438 ▴ 50

all_rbcL.fasta:

MK60111.1
sequence
NC_110014.1
sequence
NC_110454.1
sequence
KT25472.1
sequence

all_matK.fasta:

NC_110014.1
sequence
MK60111.1
sequence
NC_110454.1
sequence
KT25472.1
sequence

And I want the output to be:

NC_110014.1
sequence(matk)sequence(rbcl)
KT25472.1
sequence(matk)sequence(rbcl)

i.e. for each organism, the two gene sequences joined one after the other (matK followed by rbcL).

I have two multiple sequence alignment files covering the same organisms, with IDs as shown above. The two files are for two genes, rbcL and matK, and I want to concatenate both genes (rbcL + matK) to build a phylogeny.

I don't know how to concatenate the two files so that the sequences of the same organism are joined across the two genes. Can anyone help me with this?

grep concatenate awk • 1.2k views
2.1 years ago
Mensur Dlakic ★ 27k

This Perl script will do what you want. It is important that both alignments contain the same sequence names in the same order.

By the way, for two MSAs you can easily do it by hand by copying and pasting the sequences in the same order.
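If the two files do not list the sequences in the same order, matching them by ID avoids the manual reordering. The following is a minimal Python sketch of that idea, not the Perl script mentioned above. It assumes standard FASTA input with `>` header lines, that the first word of each header is the ID, and that both files are already aligned; the function names and the output file name are hypothetical.

```python
def read_fasta(lines):
    """Parse FASTA lines into a dict {sequence ID: sequence}, keeping file order."""
    parts = {}
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Assumption: the first whitespace-separated word of the header is the ID.
            fields = line[1:].split()
            current = fields[0] if fields else ""
            parts[current] = []
        elif current is not None:
            parts[current].append(line)
    return {name: "".join(chunks) for name, chunks in parts.items()}


def concatenate(first, second):
    """Join sequences that share an ID: sequence from `first`, then from `second`.

    IDs found in only one of the two inputs are skipped.
    Output order follows `first`.
    """
    return {name: first[name] + second[name] for name in first if name in second}


def write_fasta(records, handle):
    """Write {ID: sequence} records as FASTA, one sequence per line."""
    for name, seq in records.items():
        handle.write(f">{name}\n{seq}\n")


def concatenate_files(first_path, second_path, out_path):
    """Read two aligned FASTA files and write the ID-matched concatenation."""
    with open(first_path) as fh:
        first = read_fasta(fh)
    with open(second_path) as fh:
        second = read_fasta(fh)
    with open(out_path, "w") as out:
        write_fasta(concatenate(first, second), out)
```

For the files in the question, a call such as `concatenate_files("all_matK.fasta", "all_rbcL.fasta", "matK_rbcL.fasta")` would put the matK sequence first and the rbcL sequence second for each shared ID.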


Thank you for your time and suggestion. The Perl script is not working in my case because my two MSA gene files are of different lengths (both files contain the same organisms, 45 in total). The perl script requires files to have equal length of the sequences.

I can do it manually for now because I have two small files with 45 species entries, but I would like to do it on the command line so that it can be repeated quickly when needed.


The perl script requires files to have equal length of the sequences.

The script requires all sequences within the same alignment to have the same length, which they should, because differences in length get filled in by indel symbols. Different MSA files don't need to have the same length, but they do need to have the same group of sequences, in the same order. I don't know if you actually tried the script, but if it didn't work for you, it wasn't because the MSA files had different lengths.
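These requirements can be checked before concatenating. The sketch below is one possible check in Python, assuming each alignment has already been loaded into a dict mapping sequence ID to aligned sequence; the function name and the message wording are hypothetical. Note that the example files in the question list the same IDs in a different order, which is one of the conditions this check reports.

```python
def check_alignments(first, second):
    """Return a list of problems found in two alignments given as {ID: sequence}.

    An empty list means: equal sequence length within each alignment,
    the same set of IDs in both, and the same ID order.
    """
    problems = []

    # Within one alignment every sequence should have the same length.
    for label, alignment in (("first", first), ("second", second)):
        lengths = {len(seq) for seq in alignment.values()}
        if len(lengths) > 1:
            problems.append(
                f"{label} alignment has sequences of differing length: {sorted(lengths)}"
            )

    # Both alignments should contain the same group of sequences.
    only_first = sorted(set(first) - set(second))
    only_second = sorted(set(second) - set(first))
    if only_first:
        problems.append(f"IDs only in first alignment: {only_first}")
    if only_second:
        problems.append(f"IDs only in second alignment: {only_second}")

    # Same IDs but listed differently: a script that pairs sequences by
    # position would join the wrong sequences in this case.
    if not only_first and not only_second and list(first) != list(second):
        problems.append("same IDs but in a different order")

    return problems
```

If the only problem reported is the order, either reorder one file or concatenate by ID rather than by position.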


Yes, I got it; there was an error in my file. But the script still didn't produce the desired output that I described above.
