Building A Consensus Sequence From A Set Of Sequences
12.8 years ago

I have a list of around 50 PDB files/FASTA sequences (they do not belong to any one family). I badly need to build 10-15 consensus sequences from them, each representing a set of PDB files. I have used the ClustalW and Consensus web servers but could not really understand the results. Please help.

Rishika CARLBio group

multiple clustalw consensus bioperl • 15k views
12.8 years ago
Pals ★ 1.3k

I once found the program PAGAN useful in this sort of case. You could go through the manuscript first. CodonCode Aligner can also address your problem.


Thanks, but I can't install PAGAN.


Probably because it is targeted at Debian-based Linux environments. It would be really difficult, if not impossible, to install if you are using another Linux distribution.


You need libboost libraries installed: sudo apt-get install libboost-dev libboost-program-options1.42-dev libboost-regex1.42-dev


If you have problems installing PAGAN, please contact the author and ask for help. You can find the contact details on the program web site.

PAGAN requires two Boost packages (available for all platforms, including OSX and Windows) but should then compile fine.

12.8 years ago
Neilfws 49k

If what you want to do is cluster the sequences into groups and choose a representative of each group, look no further than CD-HIT. It has a web server too, if you don't want to run it locally.
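
For anyone running the standalone program: a CD-HIT run normally writes a `.clstr` file next to the representative FASTA, listing each cluster and marking its representative with a trailing `*`. The sketch below shows one way to read that file and pull out the representative and members of each cluster. The layout it parses is assumed from typical CD-HIT output, and the sequence names (`seqA`, `seqB`, `seqC`) and the helper name `parse_clstr` are made up for illustration.

```python
import re


def parse_clstr(text):
    """Parse CD-HIT .clstr text into {cluster_id: {"representative", "members"}}.

    Assumes the usual layout: a ">Cluster N" header followed by member lines
    such as "0<TAB>120aa, >name... *", where "*" marks the representative.
    """
    clusters = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">Cluster"):
            current = int(line.split()[1])
            clusters[current] = {"representative": None, "members": []}
            continue
        # Member lines carry the sequence name between ">" and "...".
        match = re.search(r">(.+?)\.\.\.", line)
        if match is None or current is None:
            continue
        name = match.group(1)
        clusters[current]["members"].append(name)
        if line.endswith("*"):
            clusters[current]["representative"] = name
    return clusters


# Hypothetical example input, not real CD-HIT output.
example = """>Cluster 0
0\t120aa, >seqA... *
1\t118aa, >seqB... at 95.00%
>Cluster 1
0\t90aa, >seqC... *
"""

clusters = parse_clstr(example)
for cluster_id, info in clusters.items():
    print(cluster_id, info["representative"], info["members"])
```

With the members of each cluster in hand, each group can then be aligned separately before building a consensus for it.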


I have just used the CD-HIT web server. What I want is to input my sequences in FASTA format, cluster them into groups and choose a representative of each group, as you correctly said. Can I do that with CD-HIT? Please reply. Thank you so much for your help.


Well yes, you can - as I said in the answer. I've used the standalone program to do exactly that; I have not used the server, but assume it can do the same.


Thanks a ton... I could do that after installing the g++ compiler. :)

12.8 years ago
brentp 24k

After generating your clusters, look here for how to generate a consensus sequence.

Every time someone asks about consensus sequences, I recommend motility. It has a C++ and a Python interface.
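
To make the idea of a consensus concrete, here is a minimal sketch of a per-column majority vote over sequences that have already been aligned to the same length. It is not motility's API and not the method from the link above; the function name `consensus`, the `threshold` parameter, and the example sequences are assumptions for illustration only. Because it just counts characters, it works for protein as well as nucleotide alignments.

```python
from collections import Counter


def consensus(aligned, threshold=0.5, ambiguous="X"):
    """Majority-vote consensus of equal-length, pre-aligned sequences.

    A column contributes its most common character only if that character
    makes up more than `threshold` of the column; otherwise `ambiguous` is
    emitted. Gap characters are counted like any other symbol.
    """
    if not aligned:
        raise ValueError("need at least one sequence")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")

    out = []
    for column in zip(*aligned):
        residue, count = Counter(column).most_common(1)[0]
        out.append(residue if count / len(column) > threshold else ambiguous)
    return "".join(out)


# Hypothetical aligned protein fragments from one cluster.
aligned = ["MKV-LA", "MKVALA", "MRV-LG"]
print(consensus(aligned))
```

Raising `threshold` makes the consensus stricter, so more columns fall back to the ambiguous symbol.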


Will this program consider protein sequences also?


CD-HIT worked out.


Thanks for the response.


@Kisun, good point. You'd have to convert to nucleotide sequence to use motility.

12.8 years ago
Assa Yeroslaviz ★ 1.8k

How about doing it with R?

The Biostrings package is very helpful and easy to understand. Have a look at the Biostrings vignette.
