Too Many Protein Sequences To Align, Advice?
2.1 years ago
Jant • 0

I'm trying to build a multiple sequence alignment (MSA) for a protein so I can find conserved regions and later test the surface-exposed ones for immunogenicity. The problem is that the protein has over 19,000 entries in UniProt, and I can't find a tool that will align that many sequences and also show me the conserved regions. I'm very new to bioinformatics, so it's quite possible I'm missing something. ClustalW on Galaxy gave up after about two days of processing, and it was the only tool I found that even accepted the FASTA file. Is there a tool that can handle this, or a way to heavily trim down the number of sequences without losing important information? Thanks in advance.

Fasta MSA • 878 views
2.1 years ago

I think Clustal Omega (not ClustalW) should easily handle 19,000 proteins. Alternatively, you can reduce the number of query proteins by clustering highly similar sequences (for example using CD-HIT).
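As a rough sketch of the clustering route, the snippet below builds and runs a CD-HIT command from Python. It assumes the `cd-hit` binary is installed and on your PATH. The 0.9 identity cutoff and the file names in the usage note are only illustrative choices, and the helper names (`word_size_for`, `cd_hit_command`, `run_cd_hit`) are made up for this example. The word-size table follows the recommendation in the CD-HIT user guide.

```python
import shutil
import subprocess


def word_size_for(identity):
    """Pick the CD-HIT word size (-n) recommended for a given identity cutoff (-c)."""
    if not 0.4 <= identity <= 1.0:
        raise ValueError("cd-hit (protein) expects an identity cutoff between 0.4 and 1.0")
    if identity >= 0.7:
        return 5
    if identity >= 0.6:
        return 4
    if identity >= 0.5:
        return 3
    return 2


def cd_hit_command(in_fasta, out_fasta, identity=0.9, threads=1):
    """Build the cd-hit argument list; nothing is executed here."""
    return [
        "cd-hit",
        "-i", in_fasta,                       # input FASTA with all sequences
        "-o", out_fasta,                      # output FASTA with one representative per cluster
        "-c", str(identity),                  # sequence identity cutoff
        "-n", str(word_size_for(identity)),   # word size matching the cutoff
        "-T", str(threads),                   # number of threads
    ]


def run_cd_hit(in_fasta, out_fasta, identity=0.9, threads=1):
    """Run cd-hit, failing early with a clear message if it is not installed."""
    if shutil.which("cd-hit") is None:
        raise FileNotFoundError("cd-hit was not found on PATH")
    subprocess.run(cd_hit_command(in_fasta, out_fasta, identity, threads), check=True)
```

Calling `run_cd_hit("all_sequences.fasta", "representatives.fasta")` (hypothetical file names) would write one representative per cluster, plus a `.clstr` file listing which sequences were merged. A lower cutoff removes more sequences but also merges more distinct variants, so it is worth checking the cluster file before trusting the reduced set.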


Assuming I'm looking at the right place ( https://www.ebi.ac.uk/Tools/msa/clustalo/ ), it unfortunately can't accept more than 4000 sequences or files over 4 MB. My FASTA file exceeds both limits by a fair bit, so it won't run. I'll give CD-HIT a look and see how that works for me, though. Thank you!


You will need to download Clustal Omega and run it from the command line.
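For anyone unsure what that looks like in practice, here is a minimal sketch that calls a locally installed Clustal Omega from Python. It assumes the executable is named `clustalo` and is on your PATH. The function names and the file names in the usage note are hypothetical, and the thread count is just an example.

```python
import shutil
import subprocess


def clustalo_command(in_fasta, out_fasta, threads=4):
    """Build the Clustal Omega argument list; nothing is executed here."""
    return [
        "clustalo",
        "-i", in_fasta,            # unaligned input sequences (FASTA)
        "-o", out_fasta,           # where to write the alignment
        "--outfmt=fa",             # write the alignment as aligned FASTA
        f"--threads={threads}",    # use several CPU cores
        "--force",                 # overwrite the output file if it exists
    ]


def run_clustalo(in_fasta, out_fasta, threads=4):
    """Run Clustal Omega, failing early with a clear message if it is not installed."""
    if shutil.which("clustalo") is None:
        raise FileNotFoundError("clustalo was not found on PATH")
    subprocess.run(clustalo_command(in_fasta, out_fasta, threads), check=True)
```

A call such as `run_clustalo("representatives.fasta", "aligned.fasta", threads=8)` is equivalent to typing the same command in a terminal. With many thousands of sequences the run can still take a long time, so clustering the input first is usually worthwhile.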


It took some help from a friend who is much more tech-savvy than me (I had trouble even finding the button to post here), but I got it working and now have an alignment! However, I don't actually know how to get the conserved regions out of it. I tried the MSA Viewer at NCBI, but it choked on the file, and a program called Gblocks that seemed promising failed even faster. I'm starting to feel like I've bitten off more than I can chew on this.
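One possible way to pull conserved regions out of a large alignment without a viewer is to score each column directly. The sketch below is not what Gblocks or the NCBI viewer do. It uses a deliberately simple measure: the fraction of sequences sharing the most common residue in each column. It assumes the alignment was saved as aligned FASTA. The thresholds (`0.9`, `min_length=8`, `max_gap_fraction=0.5`) and all function names are illustrative assumptions to tune for your data.

```python
from collections import Counter


def read_aligned_fasta(path):
    """Return the aligned sequences (equal-length strings) from a FASTA file."""
    seqs, chunks = [], []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if chunks:
                    seqs.append("".join(chunks))
                chunks = []
            elif line:
                chunks.append(line.upper())
    if chunks:
        seqs.append("".join(chunks))
    return seqs


def column_conservation(seqs, max_gap_fraction=0.5):
    """Score each column as the fraction of sequences with the most common residue.

    Columns where more than max_gap_fraction of the sequences have a gap score 0.0.
    """
    n = len(seqs)
    scores = []
    for column in zip(*seqs):
        residues = [c for c in column if c not in "-."]
        gap_fraction = 1 - len(residues) / n
        if not residues or gap_fraction > max_gap_fraction:
            scores.append(0.0)
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        scores.append(top_count / n)
    return scores


def conserved_regions(scores, threshold=0.9, min_length=8):
    """Return (start, end) alignment columns, 1-based inclusive, of conserved runs."""
    regions, start = [], None
    for i, score in enumerate(scores):
        if score >= threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_length:
                regions.append((start + 1, i))
            start = None
    if start is not None and len(scores) - start >= min_length:
        regions.append((start + 1, len(scores)))
    return regions
```

Typical use would be `conserved_regions(column_conservation(read_aligned_fasta("aligned.fasta")))`, where `aligned.fasta` is a hypothetical file name. The reported positions are alignment columns, not residue numbers in any single protein, so they need to be mapped back onto a reference sequence (skipping its gap characters) before checking surface exposure on a structure.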
