Logic To Get Consensus Sequence
10.5 years ago
SRKR ▴ 180

I have a set of aligned sequences in FASTA format and want to derive a consensus from the alignment. At most sites, one base clearly occurs most often. At sites where two or more bases occur an equal number of times, which base should be taken? An example is given below:

Seq_1: ATGCGA
Seq_2: AT-CGT
Seq_3: AT-CCG
Seq_4: AT-CCC
Seq_5: AA-CT-

By the usual conventions, this would be the consensus:

Consensus : A T G C [G/C] N

But this consensus sequence throws an error when it is aligned with other sequences. What should be done in such a scenario, and how should the consensus be determined for such sites?
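To make the tie problem concrete, here is a minimal Python sketch that tallies each alignment column and reports which bases share the highest count. It assumes gaps (`-`) are ignored when counting, which is what the consensus above implies for the third column; the function name `column_winners` is only illustrative.

```python
from collections import Counter

# Alignment copied from the example above; '-' marks a gap.
alignment = ["ATGCGA", "AT-CGT", "AT-CCG", "AT-CCC", "AA-CT-"]

def column_winners(seqs):
    """For each column, return the sorted bases tied for the highest count.

    Assumption: gap characters are skipped rather than counted as a state.
    """
    winners = []
    for column in zip(*seqs):
        counts = Counter(base for base in column if base != "-")
        if not counts:
            # Column contains only gaps, so there is no winning base.
            winners.append([])
            continue
        top = max(counts.values())
        winners.append(sorted(b for b, n in counts.items() if n == top))
    return winners

for position, bases in enumerate(column_winners(alignment), start=1):
    print(position, "/".join(bases))
```

Columns 1 to 4 each have a single winner, column 5 is a two-way tie between C and G, and column 6 is a four-way tie, which is why it is written as N.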

consensus genomics • 2.5k views

Depending on what you want to do downstream, you might be able to use IUPAC codes, such as S for [G/C].
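As a rough illustration of the IUPAC approach, the Python sketch below builds a one-letter-per-column consensus by mapping each set of tied bases to its standard IUPAC ambiguity code. It assumes the input contains only A, C, G, T and `-`, that gaps are ignored when counting, and that an all-gap column is emitted as `-`; the function name `iupac_consensus` is hypothetical. Whether a downstream tool accepts these codes has to be checked separately.

```python
from collections import Counter

# Standard IUPAC nucleotide codes, keyed by the set of bases each one represents.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}

def iupac_consensus(seqs):
    """Return a consensus string with ties encoded as IUPAC ambiguity codes.

    Assumptions: sequences are equal-length, upper-case, and contain only
    A/C/G/T and '-'; gaps are not counted as a state.
    """
    consensus = []
    for column in zip(*seqs):
        counts = Counter(base for base in column if base != "-")
        if not counts:
            consensus.append("-")  # all-gap column (an assumption of this sketch)
            continue
        top = max(counts.values())
        tied = frozenset(b for b, n in counts.items() if n == top)
        consensus.append(IUPAC[tied])
    return "".join(consensus)

seqs = ["ATGCGA", "AT-CGT", "AT-CCG", "AT-CCC", "AA-CT-"]
print(iupac_consensus(seqs))  # prints ATGCSN
```

For the alignment in the question this gives S at the fifth site (G or C) and N at the sixth, so the result stays a plain single-character sequence instead of containing bracketed alternatives.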


I can use IUPAC codes, but the application simply ignores them, which affects the alignment. I am using MEGA 4.0. Also, even if the application picked a random base for an ambiguity code, that would technically be a glitch.


Ah, you should really update your question to mention MEGA 4.0 and the other details of exactly what you're doing. Otherwise, you'll only ever get a rather generic reply like mine. With more details, hopefully someone familiar with MEGA can provide some insight into this.
