Question: Alignment & Conserved Element
3.8 years ago, bailliecharles wrote:

Hi All,

I'm very new to bioinformatics but keen to learn (I need to!). I'm hoping someone can at least point me in the right direction, as the learning resources I have come across are either very basic or far too technical.

I want to identify conserved noncoding elements within crustacean genomes for use in a phylogenomic study. Here's what I was thinking: do a pairwise alignment of two genomes, pull out the conserved regions, remove duplicates, then BLAST the results against other crustacean genomes (maybe other arthropods too) as a kind of validation. So, my questions are:

  1. Am I on the right track?

  2. How do I do the alignment? I have done many before, but only of very short regions, never a whole genome. Is this even possible, or do I need to break it down? The assemblies I have found appear to be in draft form, so if I do cut them into manageable chunks, how do I know a particular set of contigs is the same in both species?

  3. How do I know which of the final set of conserved elements are non-coding if there is no reference to use?
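
To make the "conserved regions" step of the plan above concrete, here is a minimal Python sketch. It assumes the two genomes have already been aligned and that one alignment block is available as two gapped strings of equal length. The function name `conserved_blocks`, the window size and the identity threshold are all hypothetical choices for illustration, not values taken from any tool.

```python
def conserved_blocks(aln_a, aln_b, window=50, min_identity=0.9):
    """Return merged (start, end) column ranges, half-open, in which a
    sliding window of `window` columns reaches `min_identity`.

    aln_a and aln_b are the two rows of one gapped pairwise alignment.
    Gap columns ('-') and ambiguous bases ('N') count as mismatches.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned rows must have the same length")

    a_row, b_row = aln_a.upper(), aln_b.upper()
    matches = [1 if a == b and a not in "-N" else 0
               for a, b in zip(a_row, b_row)]

    n = len(matches)
    blocks = []
    if n < window:
        return blocks

    count = sum(matches[:window])  # matches in the first window
    for start in range(n - window + 1):
        if start > 0:
            # Slide the window one column to the right.
            count += matches[start + window - 1] - matches[start - 1]
        if count / window >= min_identity:
            end = start + window
            if blocks and start <= blocks[-1][1]:
                # Overlaps or touches the previous block: extend it.
                blocks[-1] = (blocks[-1][0], end)
            else:
                blocks.append((start, end))
    return blocks


if __name__ == "__main__":
    row_a = "ACGT" * 5 + "A" * 10
    row_b = "ACGT" * 5 + "C" * 10
    print(conserved_blocks(row_a, row_b, window=10, min_identity=1.0))
```

The returned ranges are alignment columns, not genome coordinates, so they would still need mapping back to each assembly before any BLAST step.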

Any help hugely appreciated!
C

3.8 years ago, abascalfederico (Spain) wrote:

I don't think this kind of conserved noncoding element is long enough to BLAST successfully against a third genome. The best way may be to have a multiple genome alignment, but this can be very complicated to build. Alternatively, you could build several pairwise genome alignments. This is complicated too, but less so. I would use LAST for the pairwise alignments, but you would need to post-process the results to identify the best hits among the millions of alignments. I did this once, and it took me a lot of time to write and tune the scripts. It's a hard problem to start with! I hope someone else can suggest an easier way of doing this. Are these genomes in UCSC? If they are, UCSC may have the pairwise alignments already calculated. It's much easier to work with mammalian genomes than with crustacean ones!
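
As a rough idea of what the "identify the best hits" post-processing can look like, here is a minimal Python sketch. It is not the script mentioned above, and it does not parse LAST output: it assumes the alignments have already been read into simple records with half-open coordinates and a score. The `Hit` record and the `best_hits` function are hypothetical names used only for illustration. The rule applied is a greedy one: take hits from highest to lowest score and keep a hit only if it does not overlap an already kept hit on the same query sequence.

```python
from collections import defaultdict, namedtuple

# Hypothetical record for one pairwise alignment (half-open coordinates).
Hit = namedtuple("Hit", "query q_start q_end target t_start t_end score")


def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals share at least one position."""
    return a_start < b_end and b_start < a_end


def best_hits(hits):
    """Keep the highest-scoring hit for each region of each query sequence."""
    kept_by_query = defaultdict(list)
    kept = []
    for hit in sorted(hits, key=lambda h: h.score, reverse=True):
        already_covered = any(
            overlaps(hit.q_start, hit.q_end, k.q_start, k.q_end)
            for k in kept_by_query[hit.query]
        )
        if already_covered:
            continue  # a better-scoring hit already claims this region
        kept_by_query[hit.query].append(hit)
        kept.append(hit)
    # Report in query order rather than score order.
    return sorted(kept, key=lambda h: (h.query, h.q_start))


if __name__ == "__main__":
    example = [
        Hit("ctg1", 0, 100, "scafA", 500, 600, 90),
        Hit("ctg1", 50, 150, "scafB", 10, 110, 70),   # overlaps the hit above
        Hit("ctg1", 200, 260, "scafA", 900, 960, 40),
        Hit("ctg2", 0, 80, "scafC", 0, 80, 55),
    ]
    for hit in best_hits(example):
        print(hit)
```

The overlap check compares each hit against every kept hit on the same query, which is fine for a sketch but would probably need an interval index for millions of alignments. A stricter filter would also apply the same rule on the target side.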


Thanks, I'll have a look at LAST. Unfortunately they aren't on UCSC, and unfortunately I'm working with crustaceans (one of the few it would seem!).

Reply written 3.8 years ago by bailliecharles