Multiple sequence alignment over all 3'UTRs of a genome
6.2 years ago
aramis.1994 ▴ 10

Hi all,

I am looking for advice on an efficient and accurate way to obtain a multiple sequence alignment of all the 3'UTRs of a genome. I'm aiming to detect conserved and repetitive patterns that could be targets for regulatory elements, but I am running into memory limitations: classic multiple alignment tools (such as Muscle) cannot handle files that contain more than 500 sequences or that are larger than 10 Mb.

I would hugely appreciate any ideas or feedback.

Thank you very much in advance!

3-prime-UTR genome multiple-sequence-alignment alignment • 1.6k views

You could try using CD-HIT to remove redundancy from your sequences and thereby reduce the size of the data set.
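To illustrate the idea of redundancy removal, here is a minimal Python sketch. It is not CD-HIT and does not reproduce its algorithm or output; it is a simplified stand-in that processes sequences longest first and keeps one only if its k-mer Jaccard similarity to every sequence kept so far stays below a threshold. The function names, `k=8`, and `threshold=0.9` are assumptions chosen for illustration.

```python
def kmers(seq, k=8):
    """Return the set of all k-mers (words of length k) in a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def remove_redundant(seqs, k=8, threshold=0.9):
    """Greedy, longest-first filtering of near-duplicate sequences.

    seqs maps sequence names to sequences. A sequence is kept only if its
    k-mer Jaccard similarity to every representative kept so far is below
    threshold. Returns the names of the kept representatives.
    This is a hypothetical simplification, not the CD-HIT algorithm.
    """
    representatives = []  # list of (name, k-mer set)
    by_length = sorted(seqs.items(), key=lambda item: len(item[1]), reverse=True)
    for name, seq in by_length:
        words = kmers(seq, k)
        is_redundant = False
        for _, rep_words in representatives:
            union = words | rep_words
            if union and len(words & rep_words) / len(union) >= threshold:
                is_redundant = True
                break
        if not is_redundant:
            representatives.append((name, words))
    return [name for name, _ in representatives]
```

The pairwise comparison is quadratic in the number of representatives, so for a whole genome's worth of UTRs a dedicated tool will be much faster; the sketch is only meant to show what "removing redundancy" does to the data set.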

6.2 years ago

Personally, I don't think that trying to align the 3'UTRs of all genes in a genome will result in anything decent! There is way too much variation (i.e. too little conservation) to produce a sensible multiple alignment.

Moreover, a global alignment is probably not even technically feasible. The lengths of the 3'UTRs are too variable, and if you're looking for motifs, those are too short (insignificant) to be visible in a global multiple alignment.

A better approach (what most people do) is to first group genes (UTRs) into biologically significant clusters (e.g. co-expression, same pathway, ...) and then analyze the UTRs within those smaller groups. Even then, I assume a simple MSA will not yield much. You're better off trying software designed specifically for motif detection; I'm thinking of phylogenetic footprinting, phylogenetic shadowing, motif sampling, RSA-tools, ...
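As a rough illustration of the "group first, then look for motifs" idea, here is a naive Python sketch that ranks k-mers by how much more often they occur in the UTRs of one gene group than in a background set. It is not what RSA-tools or a motif sampler does (there is no statistical model and no allowance for degenerate positions); the function names, `k=6`, `min_fraction`, and the pseudocount are all assumptions made for the example.

```python
from collections import Counter


def kmer_presence(seqs, k):
    """Count, for each k-mer, how many sequences contain it at least once."""
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        counts.update({seq[i:i + k] for i in range(len(seq) - k + 1)})
    return counts


def enriched_kmers(group, background, k=6, min_fraction=0.8, pseudocount=1.0):
    """Rank k-mers shared by most UTRs of a group against a background set.

    Returns (kmer, fraction_of_group, enrichment_ratio) tuples, highest
    ratio first. Only k-mers present in at least min_fraction of the group
    sequences are reported. The pseudocount avoids division by zero for
    k-mers that never occur in the background.
    """
    group_counts = kmer_presence(group, k)
    background_counts = kmer_presence(background, k)
    results = []
    for kmer, n in group_counts.items():
        group_frac = n / len(group)
        if group_frac < min_fraction:
            continue
        bg_frac = (background_counts[kmer] + pseudocount) / (len(background) + pseudocount)
        results.append((kmer, group_frac, group_frac / bg_frac))
    return sorted(results, key=lambda r: r[2], reverse=True)
```

Here `group` would be the UTR sequences of, say, one co-expression cluster, and `background` the UTRs of the remaining genes. Overlapping k-mers from the same motif will show up as separate hits, which is one reason dedicated motif-detection software is the better choice for real analyses.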
