Question

Aligning Short Reads To Homologous Sequences?

4

13.4 years ago

Goldbear ▴ 130

I'm working on a de novo transcriptome project for a plant whose genome has not been sequenced. In general, Trinity is working pretty well, but we are seeing some spurious fusion genes (the N-terminus of gene A fused to the C-terminus of gene B). For the questions we are asking, having the correct full-length protein sequence is important.

I was wondering whether we could get around this problem by first mapping short reads to homologous protein sequences from Arabidopsis and then feeding each of those subcollections into a contig builder. Is there any software that acts like blastx but works well on short Illumina reads?

Thanks

transcriptome short aligner • 3.2k views

ADD COMMENT • link updated 11.0 years ago by rob234king ▴ 610 • written 13.4 years ago by Goldbear ▴ 130

0

Hi, it would be very useful to know the read length and the divergence between your species and Arabidopsis first.

ADD REPLY • link 13.4 years ago by Haibao Tang 3.0k

0

Hi, my reads are about 90 bp. Unfortunately, our species is quite divergent from Arabidopsis. Our strategy may be to fully sequence one representative species from our clade and then use that as a reference for the other species. Thanks!

ADD REPLY • link 13.3 years ago by Goldbear ▴ 130

score 2 · Answer 1 · 2011-06-28

13.4 years ago

Vitis ★ 2.5k

There is a blastx-based pipeline called STM (published in Genome Research) that uses de novo contigs and their blastx hits against proteins from a closely related species to improve the de novo assembly. It should also not be hard to put together a similar pipeline in Perl, which would let you dissect the blastx output more closely. However, I think choosing the right reference taxon is a very important factor.
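As a rough illustration of what "dissecting" blastx output could look like, here is a minimal sketch (in Python rather than Perl, and not the STM method itself). It assumes the contigs were searched against reference proteins with blastx and the results saved in the standard 12-column tabular format (`-outfmt 6`). It flags contigs whose best hits to different proteins cover largely separate stretches of the contig, which is one possible signature of a fusion. The function names, the E-value cutoff, the `max_overlap` threshold, and the demo rows are all made up for illustration.

```python
import csv
import io
from collections import defaultdict


def parse_hits(handle, max_evalue=1e-5):
    """Yield (contig, protein, q_lo, q_hi, bitscore) from BLAST tabular output."""
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue
        qstart, qend = int(row[6]), int(row[7])
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue <= max_evalue:
            # blastx reports qstart > qend for reverse-strand frames, so normalize.
            yield row[0], row[1], min(qstart, qend), max(qstart, qend), bitscore


def candidate_fusions(handle, max_evalue=1e-5, max_overlap=30):
    """Return {contig: [(protein, q_lo, q_hi), ...]} for contigs whose hits to
    different proteins occupy largely separate regions of the contig."""
    best = defaultdict(dict)  # contig -> protein -> (q_lo, q_hi, bitscore)
    for contig, protein, lo, hi, score in parse_hits(handle, max_evalue):
        prev = best[contig].get(protein)
        if prev is None or score > prev[2]:
            best[contig][protein] = (lo, hi, score)

    flagged = {}
    for contig, per_protein in best.items():
        kept = []
        # Greedy pass: strongest hits first; a weaker hit is kept only if it
        # overlaps every kept hit by at most max_overlap query bases.
        for protein, (lo, hi, _) in sorted(per_protein.items(),
                                           key=lambda kv: -kv[1][2]):
            if all(min(hi, k_hi) - max(lo, k_lo) + 1 <= max_overlap
                   for _, k_lo, k_hi in kept):
                kept.append((protein, lo, hi))
        if len(kept) > 1:
            flagged[contig] = sorted(kept, key=lambda t: t[1])
    return flagged


# Hypothetical demo rows (invented contig names, protein IDs, and scores).
DEMO_ROWS = [
    ["contig1", "AT1G01010.1", "80.0", "100", "20", "0", "1", "300", "1", "100", "1e-50", "200"],
    ["contig1", "AT2G02020.1", "75.0", "100", "25", "0", "400", "699", "201", "300", "1e-40", "150"],
    ["contig2", "AT3G03030.1", "90.0", "150", "15", "0", "1", "450", "1", "150", "1e-80", "300"],
    ["contig2", "AT3G03040.1", "85.0", "144", "21", "0", "10", "440", "1", "144", "1e-60", "250"],
    ["contig3", "AT4G04040.1", "88.0", "200", "24", "0", "600", "1", "1", "200", "1e-90", "350"],
]
DEMO = "\n".join("\t".join(r) for r in DEMO_ROWS)

if __name__ == "__main__":
    for contig, parts in candidate_fusions(io.StringIO(DEMO)).items():
        print(contig, parts)
```

In the demo, only contig1 is flagged, because its two hits sit on non-overlapping parts of the contig. contig2 hits two proteins over the same region (more like paralogs than a fusion), and contig3 has a single reverse-strand hit. Flagged contigs would still need manual inspection, and with a distant reference like Arabidopsis the thresholds would likely need tuning.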

ADD COMMENT • link 13.4 years ago by Vitis ★ 2.5k

0

Thanks! This was exactly what I was looking for.

ADD REPLY • link 13.3 years ago by Goldbear ▴ 130

score 0 · Answer 2 · 2013-10-27

11.0 years ago

rob234king ▴ 610

Trinity has an option intended to reduce fusion transcripts: try rerunning the assembly with --jaccard_clip.

ADD COMMENT • link 11.0 years ago by rob234king ▴ 610